Abstract

We propose a geometric approach for bounding average stopping times defined in terms
of sums of i.i.d. random variables. We consider stopping times in the hyperspace of sample
number and sample sum. Our technique relies on exploring the geometric properties of continuity
or stopping regions. In particular, we make use of the concepts of convex hull, convex set,
and supporting hyperplane. Explicit formulae and efficiently computable bounds are obtained
for average stopping times. Our techniques can be applied to bound average stopping times
involving random vectors, nonlinear stopping boundaries, and constraints on the sample number.
Moreover, we establish a stochastic characteristic of convex sets and generalize Jensen’s
inequality, Wald’s equations, and Lorden’s inequality, which are useful for investigating average
stopping times.


1 Introduction 


In many areas of engineering and science, especially probability and statistics, it is of interest to
investigate the expectation of stopping times defined in terms of sums of i.i.d. random variables.
For example, a frequent topic in the study of random walks concerns a stopping time which is
the smallest positive integer $n$ such that $X_1 + \cdots + X_n$ is no less than $f(n)$, where $X_1, X_2, \cdots$
are i.i.d. random variables and $f$ is a function of $n$. Since many sequential hypothesis testing
and estimation procedures can be cast into the context of such a stopping time, it is of practical
importance, for analyzing the efficiency of statistical inference, to evaluate the expectation of such a
stopping time in the area of sequential analysis [28, 29]. Although the literature
on such stopping times is abundant, most existing works focus on the asymptotic analysis of
average stopping times (see, e.g., [26] and the references therein). Existing techniques such as
Lorden’s inequality [13] for bounding average stopping times are limited to very specific forms of
$f(n)$. In many practical situations, $f(n)$ can be a complicated function without nice properties such
as linearity and monotonicity. The sample number $n$ may be restricted to a subset of the natural
numbers, as usually required in group sequential methods. The underlying
variables $X_i$ may be random vectors. However, there is a lack of effective techniques for obtaining
tight bounds for average stopping times that are general enough to deal with the nonlinearity
of the function $f(n)$, the constraint on the sample number $n$, and the dimensionality of the random
variables $X_1, X_2, \cdots$. Motivated by this situation, we propose a geometric approach to bound
average stopping times in a general setting. We consider stopping times in the hyperspace of the
tuple $(n, X_1 + \cdots + X_n)$, where the $X_i$ are allowed to be random vectors and $n$ is restricted to a
pre-specified subset $\mathscr{N}$ of the natural numbers. A stopping time is represented as the first time
$n \in \mathscr{N}$ that the tuple $(n, X_1 + \cdots + X_n)$ falls into a certain region, referred to as a stopping region
(or equivalently, falls outside of a certain region, referred to as a continuity region). Our main idea
is to make use of the geometric properties of the continuity region or stopping region. In particular,
we use concepts such as convexity and supporting hyperplane to develop bounds for average stopping
times, which are either explicit or amenable to convex minimization.

The remainder of the paper is organized as follows. In Section 2, we propose to investigate
stopping times in a geometric setting, which makes it possible to use geometric concepts such as
convex hull, convex set, and supporting hyperplane. Afterward, we establish a probabilistic
property of convex sets, which plays a crucial role in bounding average stopping times. In Section
3, we generalize Jensen’s inequality, Wald’s equations and Lorden’s inequality, which are fundamental
tools for investigating average stopping times. In Section 4, we establish efficient convex
minimization techniques for bounding average stopping times. In Section 5, we develop explicit
formulae for bounding average stopping times by virtue of the concept of supporting hyperplane.
In Section 6, we propose to bound average stopping times by combining the power of concentration
inequalities and the concept of geometric convexity. In Section 7, we extend the techniques to
bound average stopping times relevant to continuous-time stochastic processes such as Brownian
motion. Section 8 is the conclusion. Most proofs are given in the appendices.

In this paper, we shall use the following notations. The set of positive integers is denoted by
$\mathbb{N}$. The set of non-negative integers is denoted by $\mathbb{Z}^+$. The set of real numbers is denoted by $\mathbb{R}$.
The set of non-negative real numbers is denoted by $\mathbb{R}^+$. The set of real-valued row matrices of
size $1 \times d$ is denoted by $\mathbb{R}^d$. A row matrix in $\mathbb{R}^d$ is also called a vector. The notation $0_d$ denotes
a row matrix of size $1 \times d$ with all elements being 0. The notation $1_d$ denotes a row matrix of
size $1 \times d$ with all elements being 1. We use the notation $\top$ to denote the transpose of a matrix. We
define the following operations of row matrices:

$AB$ denotes the product of $A = [a_1, \cdots, a_d]$ and $B = [b_1, \cdots, b_d]$ in the sense that $AB = [a_1 b_1, \cdots, a_d b_d]$.

$\frac{A}{B}$ denotes the quotient of $A = [a_1, \cdots, a_d]$ divided by $B = [b_1, \cdots, b_d]$ in the sense that $\frac{A}{B} = \left[ \frac{a_1}{b_1}, \cdots, \frac{a_d}{b_d} \right]$.

$A^i$ denotes the $i$-th power of $A = [a_1, \cdots, a_d]$ in the sense that $A^i = [a_1^i, \cdots, a_d^i]$.

$|A|$ denotes the absolute value of $A = [a_1, \cdots, a_d]$ in the sense that $|A| = [|a_1|, \cdots, |a_d|]$.

For matrices $A = [a_1, \cdots, a_d]$ and $B = [b_1, \cdots, b_d]$, we write $A \leq B$ if $a_i \leq b_i$ for $i = 1, \cdots, d$.

For matrices $A = [a_1, \cdots, a_d]$ and $B = [b_1, \cdots, b_d]$, we use $\langle A, B \rangle$ to denote their inner
product, that is, $\langle A, B \rangle$ is equal to $\sum_{i=1}^{d} a_i b_i$.

For $A = [a_1, \cdots, a_d] \in \mathbb{R}^d$, its $L^p$-norm with $p \geq 1$ is defined as

$$\|A\|_p = \left( \sum_{i=1}^{d} |a_i|^p \right)^{1/p}.$$

The $L^p$-norm of the transpose of $A$ is also defined as $\|A\|_p$, that is, $\|A^\top\|_p = \|A\|_p$.

For a function, $f(v)$, of $v = [v_1, \cdots, v_d] \in \mathbb{R}^d$, we use $\frac{\partial f(v)}{\partial v}$ to denote the gradient of $f(v)$ with
respect to $v$, that is,

$$\frac{\partial f(v)}{\partial v} = \left[ \frac{\partial f(v)}{\partial v_1}, \cdots, \frac{\partial f(v)}{\partial v_d} \right].$$

For a random vector $X = [x_1, \cdots, x_d]$, we define $X^+ = [\max(0, x_1), \cdots, \max(0, x_d)]$ as the non-negative
part of $X$. Similarly, we define $X^- = [\max(0, -x_1), \cdots, \max(0, -x_d)]$ as the non-positive
part of $X$.

For a set $\mathscr{S}$, its closure and boundary are denoted by $\overline{\mathscr{S}}$ and $\partial \mathscr{S}$, respectively. The probability
of an event $E$ is denoted by $\Pr\{E\}$. The mathematical expectation of a random variable (scalar
or vector) $X$ is denoted by $\mathbb{E}[X]$. Let $\mathbb{I}_E$ denote the indicator function such that it assumes value
1 if the event $E$ occurs and it assumes value 0 otherwise. The other notations will be made clear
as we proceed.

2 Stopping Times and Convex Sets 

In this section, we shall propose to investigate stopping times with their geometric representations.
We shall also establish a connection between stopping times and convex sets. A stochastic
characterization of convex sets is developed.
2.1 Geometric Representations of Stopping Times


Existing methods for bounding the average of a stopping time typically focus on exploring the
properties of the function defining the stopping time. Consider, for example, the stopping time
mentioned in the introduction of this paper. To bound the expectation of the stopping time

$$N = \inf\{n \in \mathbb{N} : X_1 + \cdots + X_n \geq f(n)\}, \tag{1}$$

the conventional wisdom is to explore the function $f(n)$ for properties such as linearity and
monotonicity, which could be useful for bounding the average stopping time. We would like to point
out that the methods in this direction usually fail to fully exploit the geometric information of
the underlying continuity or stopping regions. To clearly address this point, we shall first provide
geometric representations of stopping times in the sequel.

Throughout the remainder of this paper, we shall use the following notations and definitions.
Let $0 \leq N_0 < N_1 < N_2 < \cdots$ be an increasing sequence of integers and define $\mathscr{N} = \{N_1, N_2, \cdots\}$.
Let $\mathscr{R}$ be a subset of $\{(t, s) : t \in \mathbb{R}^+, \; s \in \mathbb{R}^d\}$. Let $X = [x_1, \cdots, x_d]$ be a $d$-dimensional
real-valued random vector with mean $\mu = \mathbb{E}[X]$. Let $X_1, X_2, \cdots$ be i.i.d. samples of $X$. Define $S_0 = 0_d$
and

$$S_n = \sum_{i=1}^{n} X_i, \qquad \overline{X}_n = \frac{S_n}{n}$$

for $n \in \mathbb{N}$. Our effort will be devoted to stopping times which are defined in terms of the sample sum
$S_n$ (or equivalently, the sample mean $\overline{X}_n$), the region $\mathscr{R}$ and the set $\mathscr{N}$. The stopping times defined
in this way can be fairly general.

A stopping time can be defined in terms of $S_n$ as $N = \inf\{n \in \mathscr{N} : (n, S_n) \in \mathscr{R}\}$. Such a
stopping time is associated with the stopping rule: Continue observing $S_n$ until $(n, S_n) \in \mathscr{R}$ for
some $n \in \mathscr{N}$. For such a stopping rule, the region $\mathscr{R}$ is referred to as a stopping region.

On the other hand, a stopping time can also be defined as $N = \inf\{n \in \mathscr{N} : (n, S_n) \notin \mathscr{R}\}$.
Such a stopping time is associated with the stopping rule: Continue observing $S_n$ until $(n, S_n) \notin \mathscr{R}$
for some $n \in \mathscr{N}$. For such a stopping rule, the region $\mathscr{R}$ is referred to as a continuity region.

Despite the generality of the above geometric representations, stopping times are usually
expressed in algebraic forms. A familiar example is the stopping time defined by (1). In this
paper, we propose to investigate stopping times based on their geometric representations. The
primary reason is that bounding average stopping times can be much easier when one
exploits the geometric properties of the underlying continuity or stopping region $\mathscr{R}$. As will
be seen later, this is especially true when the closure of the region $\mathscr{R}$ is convex. We discovered
that, for a wide variety of stopping times in the context of sequential hypothesis testing and
estimation, the corresponding continuity or stopping regions $\mathscr{R}$ in geometric representations are
actually convex. In the worst case that the continuity or stopping region $\mathscr{R}$ is not convex, it is
still possible to bound the average stopping time by replacing $\mathscr{R}$ with its convex hull, at the
price of extra conservatism.

To illustrate the advantage of geometric representations, consider the stopping time

$$N = \inf\{n \in \mathbb{N} : f(n, S_n) \geq 0\},$$

where $f(t, s)$ is a bivariate function of $t \in \mathbb{R}^+$ and $s \in \mathbb{R}$. Clearly, the stopping region is

$$\mathscr{R} = \{(t, s) : t \in \mathbb{R}^+, \; s \in \mathbb{R}, \; f(t, s) \geq 0\} \tag{2}$$

and the stopping time is $N = \inf\{n \in \mathbb{N} : (n, S_n) \in \mathscr{R}\}$. Similarly, the continuity region is

$$\mathscr{R}^c = \{(t, s) : t \in \mathbb{R}^+, \; s \in \mathbb{R}, \; f(t, s) < 0\} \tag{3}$$

and the stopping time is $N = \inf\{n \in \mathbb{N} : (n, S_n) \notin \mathscr{R}^c\}$. It can be shown that if $f$ is a concave
function, then the stopping region (2) is convex. If $f$ is a convex function, then the continuity
region (3) is convex. It is important to note that the convexity of the stopping or continuity
region may also hold in situations when the function $f$ is neither convex nor concave. Moreover,
even if neither the continuity region nor the stopping region is convex, we can still bound the
average stopping time by using their convex hulls. This example demonstrates that, in contrast
to using algebraic forms of stopping times, it is possible to exploit the convexity of the continuity
or stopping regions in geometric representations under much weaker conditions.
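As a minimal sketch of the two representations, the Python snippet below evaluates the same stopping rule once in algebraic form and once through a region membership test. The boundary function `f`, the uniform increments, and all function names are illustrative assumptions and do not come from the paper.

```python
import random

def f(t, s):
    # Hypothetical bivariate function: stop once s reaches 3 + sqrt(t).
    return s - (3.0 + t ** 0.5)

def in_stopping_region(t, s):
    # Geometric representation: membership of (t, s) in the stopping region.
    return f(t, s) >= 0.0

def stopping_time_algebraic(xs):
    # First n with f(n, S_n) >= 0, written directly in terms of f.
    s = 0.0
    for n, x in enumerate(xs, start=1):
        s += x
        if f(n, s) >= 0.0:
            return n
    return None  # boundary not reached within the supplied samples

def stopping_time_geometric(xs, region):
    # First n such that the point (n, S_n) falls into the region.
    s = 0.0
    for n, x in enumerate(xs, start=1):
        s += x
        if region(n, s):
            return n
    return None

rng = random.Random(0)
samples = [rng.uniform(0.0, 2.0) for _ in range(10_000)]
n_alg = stopping_time_algebraic(samples)
n_geo = stopping_time_geometric(samples, in_stopping_region)
```

Both functions return the same index. The geometric version only needs a membership test, so the region can be swapped for a convex hull or any other set without touching the sampling loop.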

2.2 A Stochastic Characteristic of Convex Sets 

As discussed in the last subsection, there exists a useful connection between stopping times and
convex sets. Since continuity or stopping regions are convex in many situations, it is natural
to ask under what conditions the expectation of a random vector is
contained in a convex set. Our investigation indicates that if a set in a finite-dimensional
Euclidean space is convex, then the set contains the expectation of any random vector almost surely
contained in the set. Conversely, if a set in a finite-dimensional Euclidean space contains the
expectation of any random vector almost surely contained in the set, then the set is convex.
More formally, we have established the following results.

Theorem 1 If $\mathscr{D}$ is a convex set in $\mathbb{R}^n$, then $\mathbb{E}[X] \in \mathscr{D}$ holds for any random vector $X$ such that
$\Pr\{X \in \mathscr{D}\} = 1$ and that $\mathbb{E}[X]$ exists. Conversely, if $\mathscr{D}$ is a set in $\mathbb{R}^n$ such that $\mathbb{E}[X] \in \mathscr{D}$ holds
for any random vector $X$ such that $\Pr\{X \in \mathscr{D}\} = 1$ and that $\mathbb{E}[X]$ exists, then $\mathscr{D}$ is convex.

See Appendix A for a proof. This theorem plays a fundamental role in our approach for
bounding average stopping times.

We would like to point out that the first assertion of Theorem 1 provides a simple proof of
Jensen’s inequality. To see this, note that if a function is convex, then its epigraph, the region
above its graph, is a convex set. Hence, if $f$ is a convex function, then for any random variable $X$,
since $(X, f(X))$ is contained in the epigraph of $f$, it follows from Theorem 1 that $(\mathbb{E}[X], \mathbb{E}[f(X)])$
is contained in its epigraph. This implies that $\mathbb{E}[f(X)] \geq f(\mathbb{E}[X])$ by the notion of epigraph.
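The epigraph argument can be checked numerically on a finite distribution, where expectations are exact sums. The sketch below is illustrative only: the support points, the probabilities, and the choice $f(x) = e^x$ are assumptions made for the example.

```python
import math

# Hypothetical finite distribution: support points and their probabilities.
support = [-1.0, 0.5, 2.0, 3.5]
probs = [0.1, 0.4, 0.3, 0.2]

def f(x):
    # A convex function; its epigraph {(x, y): y >= f(x)} is a convex set.
    return math.exp(x)

# The random point (X, f(X)) lies on the graph of f, hence in the epigraph.
mean_x = sum(p * x for x, p in zip(support, probs))
mean_fx = sum(p * f(x) for x, p in zip(support, probs))

# Theorem 1 places the mean point in the epigraph, i.e. mean_fx >= f(mean_x).
in_epigraph = mean_fx >= f(mean_x)
jensen_gap = mean_fx - f(mean_x)
```

The mean point $(\mathbb{E}[X], \mathbb{E}[f(X)])$ is a convex combination of points on the graph of $f$, which is why it cannot leave the epigraph.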


3 Generalizations of Jensen’s Inequality, Wald’s Equations and 
Lorden’s Inequality 

In this section, we shall generalize Jensen’s inequality, Wald’s equations and Lorden’s inequality, 
which can be useful for evaluating average stopping times. 


3.1 Generalization of Jensen’s Inequality 

For investigating the convexity of continuity and stopping regions, we have the following results.

Theorem 2 Suppose that $g(z)$ is a multivariate convex function of $z \in \mathscr{D}$, where $\mathscr{D}$ is a convex
set. Define $f(t, s) = t \, g\left(\frac{s}{t}\right)$ for $t \neq 0$ and $s$ such that $\frac{s}{t} \in \mathscr{D}$. Then, $f(t, s)$ is a multivariate convex
function of $t > 0$ and $s$ such that $\frac{s}{t} \in \mathscr{D}$. Similarly, $f(t, s)$ is a multivariate concave function of
$t < 0$ and $s$ such that $\frac{s}{t} \in \mathscr{D}$.

See Appendix B for a proof. As an application of Theorem 2, consider the stopping time $N =
\inf\{n \in \mathscr{N} : f(n, S_n) > 0\}$, with

$$f(n, S_n) = \sum_{\ell=1}^{k} \left[ n \, g_\ell\!\left( \frac{A_\ell S_n^\top + \beta_\ell}{n} \right) + h_\ell(n, S_n) \right],$$

where $g_\ell$, $h_\ell$ are multivariate convex functions, $A_\ell \in \mathbb{R}^d$, and $\beta_\ell \in \mathbb{R}$. The convexity of the
continuity region follows immediately from Theorem 2.

By virtue of Theorem 2, we have derived the following results.

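Theorem 3 Let $Z$ be a random vector and $Y$ be a scalar random variable such that $\frac{Z}{Y}$ and $\frac{\mathbb{E}[Z]}{\mathbb{E}[Y]}$
are contained in a convex set $\mathscr{D}$. Assume that $g(z)$ is a multivariate convex function of $z \in \mathscr{D}$.
Then,

$$\mathbb{E}\left[ Y g\left( \frac{Z}{Y} \right) \right] \geq \mathbb{E}[Y] \, g\left( \frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \right) \quad \text{if } Y \text{ is a positive random variable};$$

$$\mathbb{E}\left[ Y g\left( \frac{Z}{Y} \right) \right] \leq \mathbb{E}[Y] \, g\left( \frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \right) \quad \text{if } Y \text{ is a negative random variable}.$$

See Appendix C for a proof. It should be noted that Theorem 3 generalizes Jensen’s inequality.
In the special case that $Y = 1$, the first inequality of Theorem 3 reduces to Jensen’s inequality.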
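A small exact computation illustrates the first inequality of Theorem 3 for a positive $Y$. The joint distribution of $(Y, Z)$ and the choice $g(z) = z^2$ below are assumptions made for the example. With this $g$, the inequality reads $\mathbb{E}[Z^2 / Y] \geq (\mathbb{E}[Z])^2 / \mathbb{E}[Y]$.

```python
# Hypothetical joint distribution of (Y, Z): Y is a positive scalar, Z a scalar.
outcomes = [(1.0, 0.5), (2.0, 3.0), (4.0, 1.0), (0.5, 2.0)]  # (y, z) pairs
probs = [0.25, 0.25, 0.25, 0.25]

def g(z):
    # A convex function on the real line.
    return z * z

# Left side: E[Y g(Z / Y)], computed exactly over the finite support.
lhs = sum(p * y * g(z / y) for (y, z), p in zip(outcomes, probs))

# Right side: E[Y] g(E[Z] / E[Y]).
mean_y = sum(p * y for (y, _), p in zip(outcomes, probs))
mean_z = sum(p * z for (_, z), p in zip(outcomes, probs))
rhs = mean_y * g(mean_z / mean_y)
```

Setting every `y` to 1 in `outcomes` turns the comparison into the ordinary Jensen inequality for `g`.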
3.2 Generalization of Wald’s Equations

Making use of Theorem 3, we have generalized Wald’s equations [29] as follows.

Theorem 4 Let $X_1, X_2, \cdots$ be i.i.d. samples of random vector $X$ with mean $\mu = \mathbb{E}[X]$ and
variance $\nu = \mathbb{E}[|X - \mu|^2]$. Assume that $N$ is an integer-valued random variable such that $\mathbb{E}[N] <
\infty$ and that for any possible value $n$ of $N$, the event $\{N = n\}$ depends only on $X_1, \cdots, X_n$.
Define

$$S_N = \sum_{i=1}^{N} X_i, \qquad \overline{X}_N = \frac{S_N}{N}, \qquad V_N = \frac{(S_N - N\mu)^2}{N}.$$

The following assertions hold.

(I): If $g$ is a convex function on a convex set $\mathscr{D}$ in $\mathbb{R}^d$ such that $\mathscr{D}$ contains $\mu$ and the range
of $\overline{X}_N$, then

$$\mathbb{E}[N g(\overline{X}_N)] \geq \mathbb{E}[N] \, g(\mu).$$

(II): If $g$ is a convex function on a convex set $\mathscr{D}$ in $\mathbb{R}^d$ such that $\mathscr{D}$ contains $\nu$ and the range
of $V_N$, then

$$\mathbb{E}[N g(V_N)] \geq \mathbb{E}[N] \, g(\nu).$$

See Appendix D for a proof. To see why the inequality in the first assertion of Theorem
4 is a generalization of Wald’s first equation, consider the function $g(x) = x$. By the convexity of
$g(x)$, we have $\mathbb{E}[N \overline{X}_N] \geq \mathbb{E}[N]\mu$. On the other hand, by the convexity of $-g(x)$, we have
$\mathbb{E}[N(-\overline{X}_N)] \geq \mathbb{E}[N](-\mu)$ or equivalently, $\mathbb{E}[N \overline{X}_N] \leq \mathbb{E}[N]\mu$. Hence, it must be true that
$\mathbb{E}[S_N] = \mathbb{E}[N \overline{X}_N] = \mathbb{E}[N]\mu$, which is Wald’s first equation. Similarly, we can demonstrate that
the inequality in the second assertion of Theorem 4 is a generalization of Wald’s second equation.
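The following Monte Carlo sketch checks Wald’s first equation and the first assertion of Theorem 4 on a simple case. The setting is an assumption chosen for illustration: $X$ is uniform on $(0, 1)$, $N$ is the first $n$ with $S_n \geq 1$, and $g(x) = x^2$. The estimates are subject to sampling error, so the comparison is only approximate.

```python
import random

def run_once(rng, level=1.0):
    # Stopping time: first n with S_n >= level, for X ~ Uniform(0, 1).
    n, s = 0, 0.0
    while s < level:
        s += rng.random()
        n += 1
    return n, s

rng = random.Random(12345)
trials = 200_000
mu = 0.5  # mean of Uniform(0, 1)

sum_n = sum_s = sum_ng = 0.0
for _ in range(trials):
    n, s = run_once(rng)
    sum_n += n
    sum_s += s
    sum_ng += n * (s / n) ** 2  # N * g(sample mean) with g(x) = x^2

mean_n = sum_n / trials
mean_s = sum_s / trials
lhs = sum_ng / trials      # estimate of E[N g(sample mean at N)]
rhs = mean_n * mu ** 2     # estimate of E[N] g(mu)
```

Here `mean_s` should be close to `mean_n * mu` (Wald’s first equation), while `lhs` should not fall below `rhs` beyond sampling noise.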
As an illustration of the applications of Theorem 4, consider

$$N = \inf\left\{ n \in \mathscr{N} : n \geq \frac{1}{g(\overline{X}_n)}, \; g(\overline{X}_n) > 0 \right\}.$$

Clearly,

$$N \geq \frac{1}{g(\overline{X}_N)}, \qquad g(\overline{X}_N) > 0$$

and thus $N g(\overline{X}_N) \geq 1$ almost surely provided that $\mathbb{E}[N] < \infty$. By applying the generalization of
Wald’s first equation to the convex function $-g$, we have $\mathbb{E}[N] g(\mu) \geq \mathbb{E}[N g(\overline{X}_N)] \geq 1$, which implies the following result.

Theorem 5 Assume that $g$ is a concave function on a convex set $\mathscr{D}$ in $\mathbb{R}^d$ such that $\mathscr{D}$ contains
$\mu$ and the range of $\overline{X}_N$ and that $g(\mu) > 0$. Then, $\mathbb{E}[N] \geq \frac{1}{g(\mu)}$.

As another application example of Theorem 4, consider the $L^p$-norm function $g(s) = \|s\|_p$, where
$s \in \mathbb{R}^d$ and $p \geq 1$. As a consequence of the absolute homogeneity and subadditivity of the
$L^p$-norm, we have

$$\|\rho x + (1 - \rho) y\|_p \leq \|\rho x\|_p + \|(1 - \rho) y\|_p = \rho \|x\|_p + (1 - \rho) \|y\|_p$$

for arbitrary vectors $x, y \in \mathbb{R}^d$ and $\rho \in [0, 1]$. Hence, the $L^p$-norm function $g(s) = \|s\|_p$ is a
convex function of $s \in \mathbb{R}^d$. Applying Theorem 4 to the $L^p$-norm function $g(s) = \|s\|_p$, we have
the following results.

Theorem 6 Let $X_1, X_2, \cdots$ be i.i.d. samples of random vector $X$ with mean $\mu = \mathbb{E}[X]$ and
variance $\nu = \mathbb{E}[|X - \mu|^2]$. Assume that $N$ is an integer-valued random variable such that $\mathbb{E}[N] <
\infty$ and that for any possible value $n$ of $N$, the event $\{N = n\}$ depends only on $X_1, \cdots, X_n$.
Define $S_N = \sum_{i=1}^{N} X_i$ and $V_N = (S_N - N\mu)^2$. Then,

$$\mathbb{E}[\|S_N\|_p] \geq \mathbb{E}[N] \, \|\mu\|_p,$$

$$\mathbb{E}[\|V_N\|_p] \geq \mathbb{E}[N] \, \|\nu\|_p$$

for all $p \geq 1$.

3.3 Generalization of Lorden’s Inequality 

In order to obtain tight bounds for stopping times, we need to generalize Lorden’s inequality [13].
In this direction, we have obtained the following result.

Theorem 7 Let $Z_1, Z_2, \cdots$ be i.i.d. samples of random variable $Z$ such that $\mathbb{E}[Z^2] < \infty$. Assume
that $\lambda$ is a random variable independent of $Z_i$ for all $i \in \mathbb{N}$. Define $N_\lambda = \inf\{n \in \mathbb{N} : \sum_{i=1}^{n} Z_i \geq \lambda\}$
and $R_\lambda = \sum_{i=1}^{N_\lambda} Z_i - \lambda$. Then,

$$\mathbb{E}[R_\lambda] \leq \frac{\mathbb{E}[Z^2]}{\mathbb{E}[Z]} \Pr\{Z < \lambda\} + \mathbb{E}[(Z - \lambda)^+].$$

See Appendix E for a proof.

In the following, we extend Lorden’s inequality to the case that the increment of sample
sizes is not a constant.

Theorem 8 Let $Z_1, Z_2, \cdots$ be i.i.d. samples of positive random variable $Z$ such that $\mathbb{E}[Z^2] < \infty$.
Assume that $\lambda$ is a random variable independent of $Z_i$ for all $i \in \mathbb{N}$. Define $N_\lambda = \inf\{n \in \mathscr{N} :
\sum_{i=1}^{n} Z_i \geq \lambda\}$ and $R_\lambda = \sum_{i=1}^{N_\lambda} Z_i - \lambda$. Define $Y = Z_1 + \cdots + Z_{N_1}$ and $K = \max\{N_{\ell+1} - N_\ell : \ell \in \mathbb{N}\}$.
Then,

$$\mathbb{E}[R_\lambda] \leq \left( (K - 1)\mathbb{E}[Z] + \frac{\mathbb{E}[Z^2]}{\mathbb{E}[Z]} \right) \Pr\{Y < \lambda\} + \mathbb{E}[(Y - \lambda)^+].$$

See Appendix F for a proof.
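For orientation, the classical fixed-threshold form of Lorden’s inequality bounds the mean overshoot by $\mathbb{E}[Z^2] / \mathbb{E}[Z]$ for non-negative increments with positive mean. The sketch below checks that classical form by simulation. It does not implement Theorems 7 or 8. The exponential increments and the threshold value are assumptions chosen because the exact mean overshoot is then known to be 1.

```python
import random

def overshoot(rng, level):
    # Overshoot R = S_N - level at the first n with S_n >= level, Z ~ Exp(1).
    s = 0.0
    while s < level:
        s += rng.expovariate(1.0)
    return s - level

rng = random.Random(2024)
trials = 100_000
level = 5.0
mean_overshoot = sum(overshoot(rng, level) for _ in range(trials)) / trials

# For Z ~ Exp(1): E[Z] = 1 and E[Z^2] = 2, so the classical bound equals 2.
lorden_bound = 2.0 / 1.0
```

By the memoryless property the overshoot is again exponential with mean 1, so the classical bound of 2 holds with room to spare in this example.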
4 Bounding Average Stopping Time via Convex Optimization


In this section, we shall demonstrate that the general problem of bounding average stopping times 
can be converted into problems of convex minimization, which can be readily solved by modern 
optimization theory and algorithms. In some particular cases, it is possible to obtain explicit 
bounds for average stopping times. 

4.1 Lower Bound on Average Stopping Time 

Consider the stopping time $N = \inf\{n \in \mathscr{N} : (n, S_n) \in \mathscr{R}\}$, where the set $\mathscr{R}$ is called a stopping
region. We have the following results on the average stopping time.

Theorem 9 Suppose that the stopping region $\mathscr{R}$ is a convex set. Define $\mathscr{A} = \{t \in \mathbb{R}^+ : (t, t\mu) \in
\mathscr{R}\}$. Then, $\mathbb{E}[N] \geq \min \mathscr{A}$ provided that $\mathscr{A}$ is not empty. Moreover, $\mathbb{E}[N] = \infty$ provided that $\mathscr{A}$
is empty.

See Appendix G for a proof.
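The lower bound of Theorem 9 only requires locating where the ray $t \mapsto (t, t\mu)$ enters the stopping region. The sketch below does this by bisection and compares the result with a simulated average. The half-plane region, the uniform increments, and the helper name `min_ray_time` are assumptions made for illustration, and the bisection assumes that membership along the ray switches from false to true exactly once.

```python
import random

def min_ray_time(in_region, mu, t_hi=1e6, tol=1e-9):
    # Smallest t in [0, t_hi] with (t, t * mu) in the region, by bisection.
    # Assumption: membership along the ray switches from False to True once.
    if not in_region(t_hi, t_hi * mu):
        return float("inf")  # the ray does not enter the region before t_hi
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if in_region(mid, mid * mu):
            hi = mid
        else:
            lo = mid
    return hi

def in_region(t, s):
    # Hypothetical convex stopping region: the half-plane s >= 1.
    return s >= 1.0

mu = 0.5  # mean of Uniform(0, 1)
lower_bound = min_ray_time(in_region, mu)

rng = random.Random(7)
trials = 100_000
total = 0
for _ in range(trials):
    n, s = 0, 0.0
    while not in_region(n, s):
        s += rng.random()
        n += 1
    total += n
mean_n = total / trials
```

In this example the ray enters the region at $t = 1 / \mu = 2$, and the simulated average stopping time stays above that value.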

4.2 Upper Bounds on Average Stopping Time 

As mentioned before, a general problem is to bound the expectation of the stopping time

$$N = \inf\{n \in \mathscr{N} : (n, S_n) \notin \mathscr{R}\}, \tag{4}$$

where $\mathscr{R}$ is called the continuity region. For the boundedness of $\mathbb{E}[N]$, consider the following
assumptions:

(I) There exist numbers $\lambda > 0$ and $K$ such that

$$N_{\ell+1} \leq \lambda N_\ell + K \tag{5}$$

for all $\ell \geq 0$.

(II) Either $\limsup_{\ell \to \infty} (N_{\ell+1} - N_\ell) < \infty$ or $\liminf_{\ell \to \infty} \frac{N_{\ell+1}}{N_\ell} > 1$.

(III) $\mathscr{R}$ is a convex set containing $(0, 0_d)$.

(IV) $\{(N_0, S_{N_0}) \in \mathscr{R}\}$ is a sure event.

(V) There exists a unique positive number $m$ such that $(m, m\mu) \in \partial \mathscr{R}$.

(VI) Each element of $\mathbb{E}[|X|^3]$ is finite.

It should be noted that sample sizes used in group sequential methods [30] typically
satisfy the inequality (5) for some numbers $\lambda > 0$ and $K$. For the stopping time defined by (4),
we have established the following result.

Theorem 10 If assumptions (I)-(VI) are fulfilled, then $\mathbb{E}[N] < \infty$.

See Appendix H for a proof.

For the purpose of bounding $\mathbb{E}[N]$, define

$$M = \sup\{N_\ell : \ell \in \mathbb{Z}^+, \; N > N_\ell\}. \tag{6}$$

If $\mathbb{E}[N] < \infty$, then it must be true that $\Pr\{N < \infty\} = 1$. Let $\ell$ be the index at the termination of
the sampling process. Then $\ell$ is a random variable such that $N = N_\ell$ and $M = N_{\ell - 1}$.
It should be noted that $M$ is not a stopping time and thus $\mathbb{E}[S_M]$ is, in general, not equal to
$\mathbb{E}[M]\mu$. In other words, Wald’s first equation [29] is not applicable to $S_M$, although it holds for
$S_N$.

Clearly, as a consequence of the definition of $M$ and assumption (I), we have $N \leq \lambda M + K$
and

$$\mathbb{E}[N] \leq \lambda \mathbb{E}[M] + K. \tag{7}$$

In view of (7), to bound $\mathbb{E}[N]$, it suffices to bound $\mathbb{E}[M]$. For this purpose, we have the following
general result.

Theorem 11 If assumptions (I)-(VI) are fulfilled, then

$$\mathbb{E}[M] \leq \max_{(t, s) \in \mathscr{D}} t, \tag{8}$$

where $\mathscr{D} = \{(t, s) \in \mathscr{R} : t\alpha + \zeta \leq s \leq t\beta + \eta\}$, with

$$\alpha = \mu - \lambda \mathbb{E}[(X - \mu)^+], \quad \beta = \mu + \lambda \mathbb{E}[(X - \mu)^-], \quad \zeta = (N_0 - K) \mathbb{E}[(X - \mu)^+], \quad \eta = (K - N_0) \mathbb{E}[(X - \mu)^-].$$

See Appendix I for a proof. It can be checked that $\mathscr{D}$ is a convex set. Moreover, $\max_{(t, s) \in \mathscr{D}} t =
-\min_{(t, s) \in \mathscr{D}} f(t, s)$, where $f(t, s) = -t$ is a convex function of $(t, s)$ contained in the convex set
$\mathscr{D}$. Therefore, the upper bound in (8) can be readily evaluated by convex minimization. With
recent improvements in computing and in optimization theory, convex minimization is nearly as
straightforward as linear programming (see, e.g., [1] for a comprehensive treatment). Convex
minimization problems can be solved by contemporary methods such as subgradient projection
methods [22], interior-point methods [21], etc.
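For scalar $X$ the maximization in (8) reduces to a one-dimensional search, because the feasible values of $t$ form an interval when the constraint set is convex. The sketch below works under that assumption. The setting (uniform increments, $N_\ell = \ell$ so that $\lambda = 1$, $K = 1$, $N_0 = 0$, and continuity region $s \leq 1$) and the helper name `max_feasible_t` are illustrative choices and are not taken from the paper.

```python
def max_feasible_t(alpha, beta, zeta, eta, region_lo, region_hi,
                   t_hi=1e6, tol=1e-9):
    # Largest t such that some s satisfies both the region constraint
    # region_lo(t) <= s <= region_hi(t) and t*alpha + zeta <= s <= t*beta + eta.
    # Assumption: the feasible set of t is an interval containing 0, which
    # holds when the constraint set is convex and contains the origin.
    def feasible(t):
        lo = max(t * alpha + zeta, region_lo(t))
        hi = min(t * beta + eta, region_hi(t))
        return lo <= hi

    if feasible(t_hi):
        return float("inf")
    lo, hi = 0.0, t_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical setting: X ~ Uniform(0, 1), N_l = l, continuity region s <= 1.
mu = 0.5
e_plus = 0.125   # E[(X - mu)^+] for Uniform(0, 1)
e_minus = 0.125  # E[(X - mu)^-] for Uniform(0, 1)
lam, K, N0 = 1.0, 1.0, 0.0

alpha = mu - lam * e_plus
beta = mu + lam * e_minus
zeta = (N0 - K) * e_plus
eta = (K - N0) * e_minus

bound_M = max_feasible_t(alpha, beta, zeta, eta,
                         region_lo=lambda t: float("-inf"),
                         region_hi=lambda t: 1.0)
bound_N = lam * bound_M + K  # via E[N] <= lambda * E[M] + K
```

In this example the search returns a bound of about 3 for $\mathbb{E}[M]$ and hence about 4 for $\mathbb{E}[N]$, while the exact value of $\mathbb{E}[N]$ for this walk is $e \approx 2.72$. For vector-valued $X$ a general convex solver would be needed instead of the bisection.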

In the case that $X$ is a bounded random vector, we have the following result.

Theorem 12 Suppose that $\Pr\{a \leq X \leq b\} = 1$, where $a, b \in \mathbb{R}^d$, and that assumptions (I)-(V)
are fulfilled. Define

$$v = \frac{(\mu - a)(b - \mu)}{b - a}, \quad \alpha = \mu - \lambda v, \quad \beta = \mu + \lambda v, \quad \zeta = (N_0 - K) v, \quad \eta = -\zeta$$

and

$$\alpha' = b + \lambda(\mu - b), \quad \beta' = a + \lambda(\mu - a), \quad \zeta' = K(\mu - b), \quad \eta' = K(\mu - a).$$

Then,

$$\mathbb{E}[M] \leq \max_{(t, s) \in \mathscr{D}} t, \tag{9}$$

where $\mathscr{D} = \{(t, s) \in \mathscr{R} : t\alpha + \zeta \leq s \leq t\beta + \eta, \; t\alpha' + \zeta' \leq s \leq t\beta' + \eta'\}$.

See Appendix J for a proof.
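One way to read the quantity $v$ is as a distribution-free replacement for the expectations $\mathbb{E}[(X - \mu)^+]$ and $\mathbb{E}[(X - \mu)^-]$ of Theorem 11: for a variable supported on $[a, b]$ with mean $\mu$, the convexity of $x \mapsto (x - \mu)^+$ gives $\mathbb{E}[(X - \mu)^+] \leq (\mu - a)(b - \mu) / (b - a)$. The sketch below checks this elementwise on an assumed two-dimensional example. The distributions and the helper name `v_bound` are illustrative.

```python
def v_bound(mu, a, b):
    # Elementwise v = (mu - a)(b - mu) / (b - a) for vectors given as lists.
    return [(m - lo) * (hi - m) / (hi - lo) for m, lo, hi in zip(mu, a, b)]

# Hypothetical example: X has independent Uniform(0, 1) and Bernoulli(0.3)
# coordinates, both supported on [0, 1].
mu = [0.5, 0.3]
a = [0.0, 0.0]
b = [1.0, 1.0]
v = v_bound(mu, a, b)

# Exact values of E[(X - mu)^+] for the two coordinates.
e_plus = [0.125, 0.3 * (1.0 - 0.3)]
dominated = all(e <= w + 1e-12 for e, w in zip(e_plus, v))
```

The Bernoulli coordinate attains the bound with equality, which shows that $v$ cannot be improved without further information about the distribution.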

In many situations, a stopping time is defined in terms of the sample mean. Consider the stopping
time

$$N = \inf\{n \in \mathscr{N} : n \geq g(\overline{X}_n)\}. \tag{10}$$

For such a stopping time, we have the following result.

Theorem 13 Assume that $g$ is a concave function on a convex set $D$ in $\mathbb{R}^d$ such that $\mu$ is
an interior point of $D$ and that the range of $\overline{X}_n$ is contained in $D$ for any $n \in \{N_0, N_1, \cdots\}$.
Assume that $K$ and $N_0$ are positive integers such that $\{N_0 < g(\overline{X}_{N_0})\}$ is a sure event and that
$N_{\ell+1} - N_\ell \leq K \leq N_0$ for $\ell = 0, 1, 2, \cdots$. Assume that each element of $\mathbb{E}[|X|^3]$ is finite. Then,

$$\mathbb{E}[N] \leq K + \max_{\theta \in \mathscr{D}} g(\theta),$$

where $\mathscr{D} = \{\theta \in D : \alpha \leq \theta \leq \beta\}$ with $\alpha = \mu - \mathbb{E}[(X - \mu)^+]$ and $\beta = \mu + \mathbb{E}[(X - \mu)^-]$.

See Appendix K for a proof.

If $\mathscr{N} = \mathbb{N}$, the stopping time defined by (10) becomes $N = \inf\{n \in \mathbb{N} : n \geq g(\overline{X}_n)\}$. For such a
stopping time, we have the following result.

Theorem 14 Assume that $g$ is a non-negative concave function on a convex set $D$ in $\mathbb{R}^d$ such
that $\mu$ is an interior point of $D$ and that the range of $\overline{X}_n$ is contained in $D$ for any $n \in \mathbb{N}$.
Assume that each element of $\mathbb{E}[|X|^3]$ is finite. Then,

$$\mathbb{E}[N] \leq 2 + \max_{\theta \in \mathscr{D}} g(\theta),$$

where $\mathscr{D} = \{\theta \in D : \alpha \leq \theta \leq \beta\}$ with $\alpha = \mu - \mathbb{E}[(X - \mu)^+]$ and $\beta = \mu + \mathbb{E}[(X - \mu)^-]$.

See Appendix L for a proof. In the case that $X$ is a bounded random vector, we have the
following result for the stopping time defined by (10).

Theorem 15 Assume that $\Pr\{a \leq X \leq b\} = 1$, where $a, b \in \mathbb{R}^d$. Assume that $g$ is a concave
function on a convex set $D$ in $\mathbb{R}^d$ such that $\mu$ is an interior point of $D$ and that the range of $\overline{X}_n$
is contained in $D$ for any $n \in \{N_0, N_1, \cdots\}$. Assume that $K$ and $N_0$ are positive integers such
that $\{N_0 < g(\overline{X}_{N_0})\}$ is a sure event and that $N_{\ell+1} - N_\ell \leq K \leq N_0$ for $\ell = 0, 1, 2, \cdots$. Then,

$$\mathbb{E}[N] \leq K + \max_{\theta \in \mathscr{D}} g(\theta),$$

where $\mathscr{D} = \{\theta \in D : \mu - v \leq \theta \leq \mu + v\}$ with $v = \frac{(\mu - a)(b - \mu)}{b - a}$.

See Appendix M for a proof.
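As an illustration of a bound of the form $2 + \max_{\theta} g(\theta)$ over $\alpha \leq \theta \leq \beta$ in the scalar case, the sketch below maximizes a concave $g$ by ternary search and compares the result with a simulated average stopping time. The function $g$, the uniform distribution, and the helper name `argmax_concave` are assumptions made for the example.

```python
import random

def argmax_concave(g, lo, hi, iters=200):
    # Ternary search for a maximizer of a concave function on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

def g(theta):
    # Hypothetical non-negative concave function on [0, 1].
    return 100.0 * theta * (1.0 - theta)

mu = 0.5          # X ~ Uniform(0, 1)
e_plus = 0.125    # E[(X - mu)^+] = E[(X - mu)^-] for Uniform(0, 1)
alpha, beta = mu - e_plus, mu + e_plus
theta_star = argmax_concave(g, alpha, beta)
upper_bound = 2.0 + g(theta_star)

rng = random.Random(99)
trials = 20_000
total = 0
for _ in range(trials):
    n, s = 0, 0.0
    while True:
        n += 1
        s += rng.random()
        if n >= g(s / n):   # stopping rule n >= g(sample mean)
            break
    total += n
mean_n = total / trials
```

Since $g$ never exceeds 25 here, the rule stops by $n = 25$ at the latest, and the simulated average sits below the computed bound of 27.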


5 Bounding Average Stopping Time with Supporting Hyperplane


In this section, we shall establish explicit bounds for average stopping times. Consider the stopping
time $N$ defined by (4). In view of (7), to bound $\mathbb{E}[N]$, it suffices to bound $\mathbb{E}[M]$. For this purpose,
we propose to use the concept of supporting hyperplane to derive explicit bounds for $\mathbb{E}[M]$. In
the sequel, we shall use the equation $As + Bt = C$, where $A^\top \in \mathbb{R}^d$, $B \in \mathbb{R}$ and $C \in \mathbb{R}$ are constants,
to represent a hyperplane which consists of points $(t, s)$ with $t \in \mathbb{R}$ and $s \in \mathbb{R}^d$ satisfying the
equation. To bound $\mathbb{E}[M]$ associated with (4), we have the following result.

Theorem 16 Suppose that assumptions (I)-(VI) are fulfilled. Let $As + Bt = C$, where $C >
0$, be the supporting hyperplane of $\mathscr{R}$ passing through $(m, m\mu) \in \partial \mathscr{R}$. Let $\mathscr{V} = \{q \in \mathbb{R}^d :
\text{each element of } q \text{ assumes value } 0 \text{ or } 1\}$. Define

$$\alpha = \mu - \lambda \mathbb{E}[(X - \mu)^+], \quad \beta = \mu + \lambda \mathbb{E}[(X - \mu)^-], \quad \zeta = (N_0 - K) \mathbb{E}[(X - \mu)^+], \quad \eta = (K - N_0) \mathbb{E}[(X - \mu)^-].$$

Then,

$$\mathbb{E}[M] \leq \max\left\{ \frac{C - A[\eta + q(\zeta - \eta)]}{B + A[\beta + q(\alpha - \beta)]} : q \in \mathscr{V} \right\},$$

provided that the minimum of $\{B + A[\beta + q(\alpha - \beta)] : q \in \mathscr{V}\}$ is positive.

See Appendix N for a proof.

From the above theorem, it can be seen that the minimum of $\{B + A[\beta + q(\alpha - \beta)] : q \in \mathscr{V}\}$
is close to the positive quantity $A\mu + B$ if $\alpha$ and $\beta$ are close to $\mu$. Actually, this frequently occurs
in practice.
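The explicit bound of Theorem 16 is a maximum over the $2^d$ binary vectors $q$, so it can be evaluated by direct enumeration when $d$ is small. The sketch below follows the ratio as we read it from the theorem, with products inside the brackets taken elementwise. The function name `explicit_bound` and the scalar example (uniform increments, $N_\ell = \ell$, continuity region $s \leq 1$ with supporting hyperplane $s = 1$) are assumptions made for illustration.

```python
from itertools import product

def dot(u, w):
    # Inner product of two vectors given as lists.
    return sum(x * y for x, y in zip(u, w))

def explicit_bound(A, B, C, alpha, beta, zeta, eta):
    # Enumerate q in {0, 1}^d and return the largest ratio
    # (C - A[eta + q(zeta - eta)]) / (B + A[beta + q(alpha - beta)]),
    # where products of vectors inside the brackets are elementwise.
    d = len(A)
    best = float("-inf")
    for q in product((0, 1), repeat=d):
        num_vec = [eta[i] + q[i] * (zeta[i] - eta[i]) for i in range(d)]
        den_vec = [beta[i] + q[i] * (alpha[i] - beta[i]) for i in range(d)]
        den = B + dot(A, den_vec)
        if den <= 0.0:
            raise ValueError("the denominator must be positive for every q")
        best = max(best, (C - dot(A, num_vec)) / den)
    return best

# Hypothetical scalar setting: X ~ Uniform(0, 1), N_l = l, continuity region
# s <= 1, supporting hyperplane s = 1 (A = 1, B = 0, C = 1).
bound_M = explicit_bound(A=[1.0], B=0.0, C=1.0,
                         alpha=[0.375], beta=[0.625],
                         zeta=[-0.125], eta=[0.125])
```

In this example the two candidate ratios are 1.4 (for $q = 0$) and 3 (for $q = 1$), so the bound on $\mathbb{E}[M]$ is 3. The enumeration cost grows as $2^d$, which is acceptable only for moderate dimension.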

In situations where $X$ is a bounded random vector, we have obtained explicit bounds for $\mathbb{E}[M]$
in connection with the stopping time $N$ defined by (4) as follows.

Theorem 17 Suppose that $\Pr\{a \leq X \leq b\} = 1$, where $a, b \in \mathbb{R}^d$, and that assumptions (I)-(V)
are fulfilled. Let $As + Bt = C$, where $C > 0$, be the supporting hyperplane of $\mathscr{R}$ passing through
$(m, m\mu) \in \partial \mathscr{R}$. Let $\mathscr{V} = \{q \in \mathbb{R}^d : \text{each element of } q \text{ assumes value } 0 \text{ or } 1\}$. Define

$$v = \frac{(\mu - a)(b - \mu)}{b - a}, \quad \alpha = \mu - \lambda v, \quad \beta = \mu + \lambda v, \quad \zeta = (N_0 - K) v, \quad \eta = -\zeta$$

and

$$\alpha' = b + \lambda(\mu - b), \quad \beta' = a + \lambda(\mu - a), \quad \zeta' = K(\mu - b), \quad \eta' = K(\mu - a).$$

Then,

$$\mathbb{E}[M] \leq \max\left\{ \frac{C - A[\eta + q(\zeta - \eta)]}{B + A[\beta + q(\alpha - \beta)]} : q \in \mathscr{V} \right\},$$

provided that the minimum of $\{B + A[\beta + q(\alpha - \beta)] : q \in \mathscr{V}\}$ is positive. Similarly,

$$\mathbb{E}[M] \leq \max\left\{ \frac{C - A[\eta' + q(\zeta' - \eta')]}{B + A[\beta' + q(\alpha' - \beta')]} : q \in \mathscr{V} \right\},$$

provided that the minimum of $\{B + A[\beta' + q(\alpha' - \beta')] : q \in \mathscr{V}\}$ is positive.

See Appendix O for a proof.

In the sequel, we shall apply the concept of supporting hyperplane and Lorden’s inequality on
overshoot to obtain explicit bounds for average stopping times. Consider the stopping time

$$N = \inf\left\{ n \in \mathbb{N} : \frac{n}{K} \in \mathbb{N}, \; (n, S_n) \notin \mathscr{R} \right\},$$

where $K$ is a positive integer. For such a stopping time, we have the following results.

Theorem 18 Assume that $\mathscr{R}$ is a convex set containing $(0, 0_d)$. Assume that each element of
$\mathbb{E}[|X|^2]$ is finite. Assume that there exists a unique positive number $m$ such that $(m, m\mu) \in \partial \mathscr{R}$.
Assume that there exists a supporting hyperplane $As + Bt = C$, where $C > 0$, of $\mathscr{R}$ passing through
$(m, m\mu)$. Define $Z = BK + A S_K$. The following assertions hold true.

(I):

$$\mathbb{E}[N] \leq m + \frac{1}{K} \left( \frac{m}{C} \right)^2 \mathbb{E}[(Z^+)^2] \leq m + K + \left( \frac{m}{C} \right)^2 \mathbb{E}\left[ (A(X - \mu))^2 \right] \leq m + K + \left( \frac{m}{C} \right)^2 \|A\|_2^2 \times \mathbb{E}\left[ \|X - \mu\|_2^2 \right].$$

(II): If the elements of $X$ are mutually independent, then $\mathbb{E}[N] \leq m + K + \left( \frac{m}{C} \right)^2 A^2 \, \mathbb{E}[(X - \mu)^2]$.

(III): If $\Pr\{a \leq X \leq b\} = 1$, where $a, b \in \mathbb{R}^d$, then

$$\mathbb{E}[N] \leq m + \frac{K m (u + v)}{C} - \frac{K m^2 u v}{C^2},$$

where $u = \frac{1}{2}[A(a + b) + |A|(a - b)] + B$ and $v = \frac{1}{2}[A(a + b) + |A|(b - a)] + B$. In particular,

$$\mathbb{E}[N] \leq m + \frac{K}{v - u} \left( \frac{m v}{C} \right)^2 \left( \frac{C}{m} - u \right) \quad \text{for } u \leq 0.$$

See Appendix P for a proof.

To illustrate the applications of Theorem 18, consider the stopping time

$$N = \inf\{n \in \mathbb{N} : (n, S_n) \notin \mathscr{R}\}. \tag{11}$$

We wish to apply assertion (I) of Theorem 18 to derive convenient upper bounds for $\mathbb{E}[N]$. For
this purpose, we define the function

$$g(v) = \sup\{t \in \mathbb{R}^+ : (t, t v) \in \mathscr{R}\} \tag{12}$$

for $v \in \mathbb{R}^d$. Note that $g(v)$ can be $\infty$. Let $\nabla(v)$ denote the gradient of $\ln(g(v))$ with respect to $v$,
that is,

$$\nabla(v) = \frac{\partial \ln(g(v))}{\partial v} = \frac{1}{g(v)} \frac{\partial g(v)}{\partial v}. \tag{13}$$

For the stopping time defined by (11), we have the following results.

Theorem 19 Assume that $\mathscr{R}$ is a convex set containing $(0, 0_d)$. Assume that $\mathbb{E}[\|X\|_2^2]$ is finite.
Assume that $g(v)$ is differentiable in a neighborhood of $v = \mu$. Then,

$$\mathbb{E}[N] \leq g(\mu) + 1 + \mathbb{E}\left[ \langle \nabla(\mu), X - \mu \rangle^2 \right] \leq g(\mu) + 1 + \|\nabla(\mu)\|_2^2 \times \mathbb{E}\left[ \|X - \mu\|_2^2 \right].$$
See Appendix Q for a proof. We can apply Theorem 19 to derive a simple bound for the
expectation of the first passage time for a random walk with a concave boundary. More specifically,
consider the stopping time

$$N = \inf\{n \in \mathbb{N} : S_n \geq f(n)\}, \tag{14}$$

where $S_n = \sum_{i=1}^{n} X_i$ is the partial sum of i.i.d. scalar random variables $X_1, X_2, \cdots$, which have
the same distribution as $X$ with mean $\mu = \mathbb{E}[X]$ and variance $\sigma^2 = \mathbb{E}[|X - \mu|^2] < \infty$. Assume
that $f(t)$ is a concave function of $t \in \mathbb{R}^+$ such that $f(0) > 0$. Assume that there exists a positive
number $m$ such that

$$m\mu = f(m), \tag{15}$$

that is, $m = g(\mu)$, where the function $g(.)$ is defined by (12) with the continuity region

$$\mathscr{R} = \{(t, s) : t \in \mathbb{R}^+, \; s \in \mathbb{R}, \; s < f(t)\}.$$

Assume that $f(t)$ is differentiable in a neighborhood of $t = m$. Then, $g(v)$ must be differentiable
in a neighborhood of $v = \mu$. Due to the concavity of the boundary function $f(.)$, the continuity
region $\mathscr{R}$ is a convex set. To apply Theorem 19 to bound the stopping time in (14), we can
calculate $\nabla(\mu)$ by (13) as follows.

In a neighborhood of $v = \mu$, we have

$$g(v) v = f(g(v)).$$

Differentiating both sides of the above equation with the chain rule yields

$$g'(v) v + g(v) = f'(g(v)) g'(v),$$

where $f'(.)$ and $g'(.)$ denote the first derivatives of $f(.)$ and $g(.)$, respectively. Solving the
equation, we obtain

$$g'(v) = \frac{g(v)}{f'(g(v)) - v}.$$

It follows that the first derivative, $g'(\mu)$, of $g(v)$ at $v = \mu$ can be obtained as

$$g'(\mu) = \frac{g(\mu)}{f'(g(\mu)) - \mu} = \frac{m}{f'(m) - \mu}.$$

Therefore, the gradient is

$$\nabla(\mu) = \frac{g'(\mu)}{g(\mu)} = \frac{m}{[f'(m) - \mu] g(\mu)} = \frac{1}{f'(m) - \mu}.$$

It follows from Theorem 19 that

$$\mathbb{E}[N] \leq m + 1 + \frac{\sigma^2}{|f'(m) - \mu|^2}. \tag{16}$$

To use formula (16), we need to obtain $m$ from equation (15). In many cases, it is possible to
derive an explicit expression of $m$ from equation (15). Even if $m$ cannot be obtained analytically,
it can still be readily computed by numerical methods such as the bisection search method. Due
to the concavity of $f(.)$ and the existence of $m$ satisfying (15), it must be true that $t\mu > f(t)$ for
large enough $t > 0$. For example, we can find such a value of $t$ as $2^k$ for some integer $k > 0$. Then,
the number $m$ can be obtained by a bisection search from the interval $(0, 2^k)$.
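The doubling-then-bisection procedure and the bound $m + 1 + \sigma^2 / (f'(m) - \mu)^2$ can be sketched as follows. The boundary $f(t) = 2 + \sqrt{t}$, the exponential increments, and the helper name `solve_m` are assumptions chosen so that $m = 4$ can be verified by hand. The simulated average is subject to sampling error.

```python
import math
import random

def solve_m(f, mu, tol=1e-10):
    # Find m with m * mu = f(m): double t until t * mu > f(t), then bisect.
    hi = 1.0
    while hi * mu <= f(hi):
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * mu > f(mid):
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def f(t):
    # Hypothetical concave boundary with f(0) > 0.
    return 2.0 + math.sqrt(t)

def f_prime(t):
    # Derivative of the boundary for t > 0.
    return 0.5 / math.sqrt(t)

mu, sigma2 = 1.0, 1.0  # X ~ Exp(1): mean 1, variance 1
m = solve_m(f, mu)
bound = m + 1.0 + sigma2 / (f_prime(m) - mu) ** 2

rng = random.Random(42)
trials = 50_000
total = 0
for _ in range(trials):
    n, s = 0, 0.0
    while True:
        n += 1
        s += rng.expovariate(1.0)
        if s >= f(n):   # first passage over the concave boundary
            break
    total += n
mean_n = total / trials
```

Here $m = 4$ and $f'(m) = 0.25$, so the bound evaluates to $5 + 16/9 \approx 6.78$, and the simulated mean stays below it.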


6 Bounding Average Stopping Time with Concentration Inequalities

In this section, we shall propose a method for bounding average stopping times by virtue of 
concentration inequalities. Consider the stopping time defined by ([U). Define 

\[ \tau = \min\{\ell \in \mathbb{N} : N_\ell > m\}, \]

where m is the unique positive number such that $(m, m\mu) \in \partial\mathscr{R}$ as defined in assumption (V) of
Section 4.2. Define

\[ \mathscr{D}_n = \{z \in \mathbb{R}^d : (n, nz) \in \mathscr{R}\} \]

and

\[ \rho(n) = \inf_{z \in \mathscr{D}_n} \|z - \mu\|_2 \]

for $n \in \mathbb{N}$. Let $\mathscr{L}$ denote the support of the random index $\ell$ such that $N_\ell = N$. For $\ell = 1, 2, \dots$
and $k = 1, \dots, d$, let $\overline{X}^{k}_{N_\ell}$ denote the k-th element of $\overline{X}_{N_\ell}$. For $k = 1, \dots, d$, let $\mu_k$ denote the
k-th element of $\mu$. We have the following results.


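Theorem 20 Suppose that assumptions (I)-(VI) are fulfilled. Then,

\[ \mathbb{E}[N] \le N_\tau + \sum_{\substack{\ell \ge \tau \\ \ell + 1 \in \mathscr{L}}} (N_{\ell+1} - N_\ell) \sum_{k=1}^{d} \Pr\left\{ \left| \overline{X}^{k}_{N_\ell} - \mu_k \right| \ge \frac{\rho(N_\ell)}{\sqrt{d}} \right\}. \tag{17} \]

Moreover, if X is a scalar random variable, then

\[ \mathbb{E}[N] \le N_\tau + \sum_{\substack{\ell \ge \tau \\ \ell + 1 \in \mathscr{L}}} (N_{\ell+1} - N_\ell) \Pr\left\{ \overline{X}_{N_\ell} \ge \mu + \rho(N_\ell) \right\} \tag{18} \]

provided that $\mu$ is less than the infimum of $\mathscr{D}_{N_\tau}$; and similarly,

\[ \mathbb{E}[N] \le N_\tau + \sum_{\substack{\ell \ge \tau \\ \ell + 1 \in \mathscr{L}}} (N_{\ell+1} - N_\ell) \Pr\left\{ \overline{X}_{N_\ell} \le \mu - \rho(N_\ell) \right\} \tag{19} \]

provided that $\mu$ is greater than the supremum of $\mathscr{D}_{N_\tau}$.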


See Appendix R for a proof.

It should be noted that for all $n \in \mathbb{N}$, the set $\mathscr{D}_n$ is convex due to the convexity of $\mathscr{R}$. Consequently,
$\rho(N_\ell)$ can be readily obtained by convex minimization. Making use of the concept of supporting
hyperplane, we have the following results.


Theorem 21 Suppose that assumptions (I)-(VI) are fulfilled. Assume that there exists a supporting
hyperplane $As + Bt = C$ of $\mathscr{R}$ passing through $(m, m\mu) \in \partial\mathscr{R}$. Then,

d 

E[AT] < N r + ( N t+i ~ N e) E Pr 

£>t k=l 

£+ie& 


X N e ~ Pk 


> 


7 e 

Vd 


sup- 


( 20 ) 


where 7 ^ = ^1 — ^===. Moreover, if X is a scalar random variable, then, 

E[AT]<1V t + X] (JV<+i-^) p r|^>^+(l-^) M+f } 


i>T 

£+!€& 


provided that p is less than the infimum of^^ T ; and similarly, 

E[1V] <N r + J2 (^+1 - N e ) Pr ix Nl < p - (l 

£ '> t v- >• 


£>t 

£+l€& 


m 

N't 


B 

A 


( 21 ) 


( 22 ) 


provided that p is greater than the supremum of ■ 


See Appendix S for a proof.

It should be noted that the probabilistic terms in Theorems 20 and 21 can be bounded by
concentration inequalities such as Chernoff bounds and Hoeffding inequalities.
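As a rough illustration of how such a concentration inequality plugs into the bounds, the sketch below evaluates a Hoeffding-based bound for a scalar X taking values in $[a, b]$. It assumes that the scalar bound has the form $N_\tau + \sum_{\ell \ge \tau} (N_{\ell+1} - N_\ell) \Pr\{\overline{X}_{N_\ell} \ge \mu + \rho(N_\ell)\}$ and that the sampling schedule is finite. The function names, the list `sample_sizes`, the 0-based index `tau` and the callable `rho` are hypothetical.

```python
import math


def hoeffding_upper_tail(n, gamma, a, b):
    """Hoeffding bound on Pr{sample mean - mu >= gamma} for n i.i.d. samples in [a, b]."""
    return math.exp(-2.0 * n * gamma ** 2 / (b - a) ** 2)


def stopping_time_bound(sample_sizes, tau, rho, a, b):
    """Evaluate N_tau + sum_{l >= tau} (N_{l+1} - N_l) * tail(N_l).

    sample_sizes : increasing list [N_1, ..., N_L] (a finite schedule is assumed)
    tau          : 0-based position of N_tau in sample_sizes
    rho          : callable n -> distance rho(n) >= 0 (assumed to be supplied by the user)
    """
    total = float(sample_sizes[tau])
    for l in range(tau, len(sample_sizes) - 1):
        n_l, n_next = sample_sizes[l], sample_sizes[l + 1]
        tail = min(1.0, hoeffding_upper_tail(n_l, rho(n_l), a, b))
        total += (n_next - n_l) * tail
    return total
```

Each probability is capped at 1, so the result never exceeds the largest sample size in the schedule; sharper tails (for example Chernoff bounds) can be substituted for `hoeffding_upper_tail` without changing the summation.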
7 Bounds for Average Stopping Times of Brownian Motion

In the last few sections, our techniques for bounding stopping times have been devoted to discrete-time
stochastic processes. Actually, the principle of such techniques can be extended to continuous-time
stochastic processes. To demonstrate this idea, we shall focus on the problem of bounding
stopping times pertaining to Brownian motion.

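Let $W_t \in \mathbb{R}^d$ be a d-dimensional Brownian motion with mean drift vector $\mu$ such that $W_0 = 0_d$
and $\mathbb{E}[W_t] = t\mu$ for $t > 0$. Define

\[ \nu = \frac{1}{t}\, \mathbb{E}\left[ |W_t - t\mu|^2 \right], \qquad \overline{W}_t = \frac{W_t}{t}, \qquad V_t = (W_t - t\mu)^2, \qquad \overline{V}_t = \frac{V_t}{t} \]

for $t > 0$. Making use of Theorem 3, we have obtained the following results.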

Theorem 22 Assume that T is a random variable such that $\mathbb{E}[T] < \infty$ and that for any possible
value t of T, the event $\{T = t\}$ depends only on $\{W_\tau : 0 \le \tau \le t\}$. The following assertions hold.

(I): If g is a convex function on $\mathbb{R}^d$, then

\[ \mathbb{E}[T g(\overline{W}_T)] \ge \mathbb{E}[T]\, g(\mu). \]

(II): If g is a convex function of vectors with non-negative elements, then

\[ \mathbb{E}[T g(\overline{V}_T)] \ge \mathbb{E}[T]\, g(\nu). \]

The proof of Theorem 22 is similar to that of Theorem 4, which is given in Appendix D.
Making use of the convexity of the $L_p$-norm and Theorem 22, we have the following results.

Theorem 23 Assume that T is a random variable such that $\mathbb{E}[T] < \infty$ and that for any possible
value t of T, the event $\{T = t\}$ depends only on $\{W_\tau : 0 \le \tau \le t\}$. Then,

\[ \mathbb{E}[\|W_T\|_p] \ge \mathbb{E}[T]\, \|\mu\|_p, \]

\[ \mathbb{E}[\|V_T\|_p] \ge \mathbb{E}[T]\, \|\nu\|_p \]

for all $p \ge 1$.

Now, consider stopping time $T = \inf\{t > 0 : (t, W_t) \notin \mathscr{R}\}$, where $\mathscr{R}$ is called the continuity
region. We have the following result.

Theorem 24 Assume that $\mathscr{R}$ is a convex set containing $(0, 0_d)$ and that there exists a unique
positive number $\tau$ such that $(\tau, \tau\mu) \in \partial\mathscr{R}$. Then, $\mathbb{E}[T] \le \tau$.

See Appendix T for a proof.

Next, consider stopping time $T = \inf\{t > 0 : (t, W_t) \in \mathscr{R}\}$, where $\mathscr{R}$ is called the stopping
region. We have obtained the following results.

Theorem 25 Suppose that the stopping region $\mathscr{R}$ is a convex set. Define $\mathscr{A} = \{t \in \mathbb{R}^+ : (t, t\mu) \in \mathscr{R}\}$.
The following assertions hold.

(I): $\mathbb{E}[T] \ge \inf \mathscr{A}$ provided that the set $\mathscr{A}$ is nonempty.

(II): $\mathbb{E}[T] \le \sup \mathscr{A}$ provided that $\mathbb{E}[T] < \infty$ and the set $\mathscr{A}$ is nonempty.

(III): $\mathbb{E}[T] = \infty$ provided that the set $\mathscr{A}$ is empty.


See Appendix U for a proof.

Consider stopping times defined in terms of $\overline{W}_t$. For stopping time $T = \inf\{t > 0 : t \ge g(\overline{W}_t)\}$,
we have the following result.

Theorem 26 Assume that g is a concave function on $\mathbb{R}^d$ with $g(\mu) > 0$. Then, $\mathbb{E}[T] \le g(\mu)$.


See Appendix V for a proof.

For stopping time T = inf |t > 0 : t > ^ , g ( Wt ) > o|, 


result. 


we have derived the following 


Theorem 27 Assume that g is a concave function on with g(p) > 0. 


Then, E[T] > 


See Appendix IW1 for a proof. 


8 Conclusion 

In this paper, we have established a geometric approach for bounding average stopping times. The
central idea of our approach is to explore the geometric convexity of the continuity or stopping
regions. Our approach is effective for a wide variety of stopping times which involve random
vectors, nonlinear boundaries, constraints on the sample number, etc. Tight bounds, which are
explicit or readily computable, are obtained for stopping times in a general setting. A probabilistic
characterization is established for convex sets. Extensions are developed for classical results such
as Jensen’s inequality, Wald’s equations and Lorden’s inequality.

A Proof of Theorem 1

We need some preliminary results. If X is a random variable such that $\Pr\{X < c\} = 1$, then it is
clear that $\mathbb{E}[X] \le c$. However, it is not so obvious that the inequality is strict. Since such strict
inequality plays a crucial role in our proof of the theorem, we state it and provide a rigorous proof
in the sequel.

Lemma 1 If X is a random variable such that $\Pr\{X < c\} = 1$, then $\mathbb{E}[X] < c$. Similarly, if X
is a random variable such that $\Pr\{X > c\} = 1$, then $\mathbb{E}[X] > c$.

Proof. We claim that there exists a positive number $\epsilon > 0$ such that $\Pr\{X \le c - \epsilon\} > 0$.
To prove the claim, we argue by contradiction. Suppose that the claim is not true. Then,
$\Pr\{X \le c - \epsilon\} = 0$ for any $\epsilon > 0$. It follows that

\[ \Pr\{X < c\} = \lim_{\epsilon \downarrow 0} \Pr\{X \le c - \epsilon\} = 0. \]

This contradicts the assumption that $\Pr\{X < c\} = 1$. So, we have proved the claim.

Now let $\epsilon > 0$ be a positive number such that $\Pr\{X \le c - \epsilon\} > 0$. Since $\Pr\{X < c\} = 1$, we
have

\begin{align*}
\mathbb{E}[X] &= \mathbb{E}[X\, \mathbb{I}_{\{X \le c - \epsilon\}}] + \mathbb{E}[X\, \mathbb{I}_{\{c - \epsilon < X < c\}}] \\
&\le (c - \epsilon) \Pr\{X \le c - \epsilon\} + c \Pr\{c - \epsilon < X < c\} \\
&= (c - \epsilon) \Pr\{X \le c - \epsilon\} + c\,(1 - \Pr\{X \le c - \epsilon\}) \\
&= -\epsilon \Pr\{X \le c - \epsilon\} + c < c.
\end{align*}

This proves the first assertion. The second assertion can be shown in a similar way.


□


Lemma 2 Assume that D is a closed convex set and X is a random vector such that $\Pr\{X \in D\} = 1$.
Then, $\mathbb{E}[X] \in D$.


Proof. We argue by contradiction. Denote $\mu = \mathbb{E}[X]$. Suppose $\mu \notin D$, i.e., $\mu$ is an
exterior point of D. By the separating hyperplane theorem [3, Theorem 4.11, page 170], there
exists a column vector $\alpha$ such that

\[ \mu\alpha < Z\alpha \]

for all $Z \in D$. Since $\Pr\{X \in D\} = 1$, it must be true that $\Pr\{\mu\alpha < X\alpha\} = 1$. From Lemma 1,
we have

\[ \mathbb{E}[\mu\alpha - X\alpha] < 0, \]

which implies that

\[ \mu\alpha < \mathbb{E}[X\alpha] = \mathbb{E}[X]\alpha = \mu\alpha. \]

This is a contradiction. The proof of the lemma is thus completed.


□


Lemma 3 If X is a random variable such that $0 < \Pr\{X < 0\} \le \Pr\{X \le 0\} = 1$, then
$\mathbb{E}[X] < 0$.

Proof. We claim that there exists a positive number $\epsilon > 0$ such that $\Pr\{X \le -\epsilon\} > 0$. To
prove the claim, we argue by contradiction. Suppose that the claim is not true. Then,
$\Pr\{X \le -\epsilon\} = 0$ for any $\epsilon > 0$. It follows that

\[ \Pr\{X < 0\} = \lim_{\epsilon \downarrow 0} \Pr\{X \le -\epsilon\} = 0. \]

This contradicts the assumption that $\Pr\{X < 0\} > 0$. So, we have proved the claim.

Now let $\epsilon > 0$ be a positive number such that $\Pr\{X \le -\epsilon\} > 0$. Since $\Pr\{X \le 0\} = 1$, we
have

\begin{align*}
\mathbb{E}[X] &= \mathbb{E}[X\, \mathbb{I}_{\{X \le -\epsilon\}}] + \mathbb{E}[X\, \mathbb{I}_{\{-\epsilon < X \le 0\}}] \\
&\le -\epsilon \Pr\{X \le -\epsilon\} + 0 \times \Pr\{-\epsilon < X \le 0\} \\
&= -\epsilon \Pr\{X \le -\epsilon\} < 0.
\end{align*}

This completes the proof of the lemma.


□


Lemma 4 Assume that D is a closed convex set and X is a random vector such that $\Pr\{X \in D\} = 1$
and $\mu = \mathbb{E}[X] \in \partial D$. Then there exist a column vector $\alpha \neq 0$ and a constant $\beta$ such that
$\Pr\{X\alpha + \beta = 0\} = 1$.


Proof. As a consequence of the convexity of D and the assumption that $\mu = \mathbb{E}[X] \in \partial D$,
it is possible to construct a supporting hyperplane $Z\alpha + \beta = 0$ through $\mu$, where $\alpha \neq 0$ is a
column vector and $\beta$ is a constant, such that $Z\alpha + \beta \le 0$ for all $Z \in D$. By the assumption that
$\Pr\{X \in D\} = 1$, we have

\[ \Pr\{X\alpha + \beta \le 0\} = 1. \]

Since $\mu$ is in the supporting hyperplane, we have $\mathbb{E}[X]\alpha + \beta = 0$. We claim that
$\Pr\{X\alpha + \beta = 0\} = 1$. To prove this claim, we argue by contradiction. Suppose the claim is not true. Then,

\[ 0 < \Pr\{X\alpha + \beta < 0\} \le \Pr\{X\alpha + \beta \le 0\} = 1. \]

It follows from Lemma 3 that

\[ \mathbb{E}[X\alpha + \beta] < 0. \]

This implies that

\[ \mathbb{E}[X]\alpha + \beta = \mathbb{E}[X\alpha + \beta] < 0, \]

which contradicts the fact that $\mathbb{E}[X]\alpha + \beta = 0$. The claim is thus established. Hence, it must
be true that $\Pr\{X\alpha + \beta = 0\} = 1$. This completes the proof of the lemma.

□

We are now in a position to prove the theorem. The second assertion follows immediately
from the notions of convex set and mathematical expectation. So, we only need to show the first
assertion. Specifically, we need to show that if $\mathscr{D}$ is a convex set in $\mathbb{R}^n$, then $\mathbb{E}[X] \in \mathscr{D}$ holds
for any random vector X such that $\Pr\{X \in \mathscr{D}\} = 1$ and that $\mathbb{E}[X]$ exists. We shall argue by
mathematical induction on the dimension n of $\mathscr{D}$. For the dimension $n = 1$, the convex set $\mathscr{D}$
must be an interval of the form $\mathscr{D} = [a, b]$, or $\mathscr{D} = (a, b)$, or $\mathscr{D} = [a, b)$, or $\mathscr{D} = (a, b]$. Making
use of Lemma 1, it is easy to see $\mathbb{E}[X] \in \mathscr{D}$ as a consequence of $\Pr\{X \in \mathscr{D}\} = 1$. Suppose the
conclusion $\mathbb{E}[X] \in \mathscr{D}$ holds for dimension $n - 1$. To complete the induction process, we need to
show, based on this hypothesis, that the conclusion $\mathbb{E}[X] \in \mathscr{D}$ holds for dimension n. Let $\overline{\mathscr{D}}$
denote the closure of $\mathscr{D}$. By Lemma 2, we have $\mathbb{E}[X] \in \overline{\mathscr{D}}$. If $\mu = \mathbb{E}[X]$ is not contained
in the boundary of $\mathscr{D}$, then it must be true that $\mu \in \mathscr{D}$. Hence, to show $\mathbb{E}[X] \in \mathscr{D}$ for dimension
n, it suffices to show it under the assumption that $\mu = \mathbb{E}[X]$ is contained in the boundary of $\mathscr{D}$.
We proceed as follows. Applying Lemma 4 to $\overline{\mathscr{D}}$ and using the assumption that $\mu = \mathbb{E}[X]$ is contained
in the boundary of $\mathscr{D}$, we conclude that there exist a column vector $\alpha \neq 0$ and a constant $\beta$ such
that $\Pr\{X\alpha + \beta = 0\} = 1$. Define

\[ \mathscr{S} = \mathscr{D} \cap \{Z \in \mathbb{R}^n : Z\alpha + \beta = 0\}. \]

Then, $\mathscr{S}$ is convex and

\[ \Pr\{X \in \mathscr{S}\} = 1, \qquad \Pr\{X\alpha + \beta = 0\} = 1. \]

Without loss of generality, assume that the i-th element of $\alpha$, denoted by $\alpha_i$, is nonzero.
Define a linear transform $\mathscr{T} : \mathscr{S} \to D$ such that for every element $Z = [z_1, \dots, z_n]$ in $\mathscr{S}$, there
exists a corresponding vector $U = [u_1, \dots, u_n] = \mathscr{T}(Z)$ such that

\[ u_i = Z\alpha + \beta, \qquad u_\ell = z_\ell, \quad \ell \in \{1, \dots, n\} \setminus \{i\}, \]

or equivalently,

\[ U = Z\,(I + \alpha e_i - e_i^\top e_i) + \beta e_i, \tag{23} \]

where I is an identity matrix of size $n \times n$ and $e_i$ is a row matrix with all elements being 0 except
the i-th element being 1. Note that $D = \{\mathscr{T}(Z) : Z \in \mathscr{S}\}$ must be convex because the transform
$\mathscr{T}$ is linear and $\mathscr{S}$ is convex. Define $Y = [y_1, \dots, y_n] = \mathscr{T}(X)$. Then,

\[ \Pr\{Y \in D\} = 1, \qquad \Pr\{y_i = 0\} = \Pr\{X\alpha + \beta = 0\} = 1 \]

and $\mathbb{E}[y_i] = 0$. Define

\[ D^* = \{[u_1, \dots, u_{i-1}, u_{i+1}, \dots, u_n] : [u_1, \dots, u_n] \in D\}. \]

Then, $D^*$ is convex because D is convex. Define random vector $V = [v_1, \dots, v_{n-1}]$ such that
$v_\ell = y_\ell$, $\ell = 1, \dots, i - 1$ and $v_\ell = y_{\ell+1}$, $\ell = i, \dots, n - 1$. Then, $\Pr\{V \in D^*\} = 1$. Since $D^*$ is
a convex set of dimension $n - 1$ and $\Pr\{V \in D^*\} = 1$, it follows from the induction hypothesis
that $\mathbb{E}[V] \in D^*$. Since $y_i = 0$ almost surely, this implies that $\mathbb{E}[Y] \in D$.

It can be checked that the determinant of the matrix $I + \alpha e_i - e_i^\top e_i$ in (23) is equal to $\alpha_i$,
which is nonzero. Hence, $I + \alpha e_i - e_i^\top e_i$ is invertible, and it follows that

\[ Z = (U - \beta e_i)\,(I + \alpha e_i - e_i^\top e_i)^{-1}. \]

This implies that the transform $\mathscr{T}$ is a one-to-one mapping from $\mathscr{S}$ to D and thus the transform
is invertible. Note that $\mathbb{E}[Y] = \mathscr{T}(\mathbb{E}[X])$ and the transform $\mathscr{T}$ maps $\mathscr{S}$ onto D. Now, we have
$\mathbb{E}[Y] \in D$. Taking the inverse transform of $\mathscr{T}$ yields

\[ \mathbb{E}[X] \in \mathscr{S} \subseteq \mathscr{D}. \]

This completes the process of induction and the theorem is thus established.
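The two algebraic facts used about the transform in (23), namely that the i-th coordinate of U equals $Z\alpha + \beta$ and that the matrix $I + \alpha e_i - e_i^\top e_i$ has determinant $\alpha_i$, can be checked on a small example. The sketch below is only an illustration with hypothetical helper names and made-up numbers; it uses exact rational arithmetic and 0-based indexing for i.

```python
from fractions import Fraction


def transform_matrix(alpha, i):
    """Matrix I + alpha e_i - e_i^T e_i: the identity with column i replaced by alpha."""
    n = len(alpha)
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    for r in range(n):
        M[r][i] = Fraction(alpha[r])
    return M


def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = Fraction(0)
    for c in range(len(M)):
        minor = [row[:c] + row[c + 1:] for row in M[1:]]
        total += (-1) ** c * M[0][c] * det(minor)
    return total


def apply_transform(Z, alpha, beta, i):
    """U = Z (I + alpha e_i - e_i^T e_i) + beta e_i for a row vector Z."""
    M = transform_matrix(alpha, i)
    n = len(Z)
    U = [sum(Z[r] * M[r][c] for r in range(n)) for c in range(n)]
    U[i] += beta
    return U


alpha, beta, i = [2, 3, 5], 7, 1          # hypothetical numbers
U = apply_transform([1, 1, 1], alpha, beta, i)
d = det(transform_matrix(alpha, i))
```

Here `U[i]` equals $Z\alpha + \beta$ while the other coordinates of Z are unchanged, and `d` equals $\alpha_i$, in line with the invertibility argument above.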


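B Proof of Theorem 2

To show the first assertion, it suffices to show that for any $(t_\ell, s_\ell)$, $\ell = 1, \dots, k$, such that $t_\ell > 0$,
$s_\ell / t_\ell \in \mathscr{D}$, and positive numbers $\lambda_\ell$, $\ell = 1, \dots, k$, such that $\sum_{\ell=1}^{k} \lambda_\ell = 1$,

\[ f\left( \sum_{\ell=1}^{k} \lambda_\ell t_\ell,\ \sum_{\ell=1}^{k} \lambda_\ell s_\ell \right) \le \sum_{\ell=1}^{k} \lambda_\ell f(t_\ell, s_\ell). \]

Define

\[ \Lambda = \sum_{\ell=1}^{k} \lambda_\ell t_\ell, \qquad \rho_\ell = \frac{\lambda_\ell t_\ell}{\Lambda}, \quad \ell = 1, \dots, k. \]

Since $\rho_\ell$, $\ell = 1, \dots, k$, are positive numbers satisfying $\sum_{\ell=1}^{k} \rho_\ell = 1$ and the function g is convex,
we have

\[ \sum_{\ell=1}^{k} \rho_\ell\, g\left( \frac{s_\ell}{t_\ell} \right) \ge g\left( \sum_{\ell=1}^{k} \rho_\ell \frac{s_\ell}{t_\ell} \right) = g\left( \frac{\sum_{\ell=1}^{k} \lambda_\ell s_\ell}{\Lambda} \right). \]

It follows that

\[ \sum_{\ell=1}^{k} \lambda_\ell f(t_\ell, s_\ell) = \sum_{\ell=1}^{k} \lambda_\ell t_\ell\, g\left( \frac{s_\ell}{t_\ell} \right) = \Lambda \sum_{\ell=1}^{k} \rho_\ell\, g\left( \frac{s_\ell}{t_\ell} \right) \ge \Lambda\, g\left( \frac{\sum_{\ell=1}^{k} \lambda_\ell s_\ell}{\Lambda} \right) = f\left( \sum_{\ell=1}^{k} \lambda_\ell t_\ell,\ \sum_{\ell=1}^{k} \lambda_\ell s_\ell \right). \]

This proves the first assertion. The second assertion can be shown in a similar way.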
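The convexity of $f(t, s) = t\, g(s/t)$ established above can be spot-checked numerically. The sketch below samples random pairs of points and convex weights and records the largest violation of the convexity inequality. It is a sanity check rather than a proof, and the choice $g(z) = \|z\|_2^2$, the sampling ranges and the function names are assumptions made for illustration.

```python
import random


def perspective(g, t, s):
    """f(t, s) = t * g(s / t) for t > 0 and a vector s."""
    return t * g([si / t for si in s])


def max_convexity_violation(g, trials=1000, dim=2, seed=0):
    """Largest value of f(mix) - mix of f over random test pairs (<= 0 if f is convex)."""
    rng = random.Random(seed)
    worst = float("-inf")
    for _ in range(trials):
        t1, t2 = rng.uniform(0.1, 5.0), rng.uniform(0.1, 5.0)
        s1 = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        s2 = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        lam = rng.random()
        t = lam * t1 + (1.0 - lam) * t2
        s = [lam * a + (1.0 - lam) * b for a, b in zip(s1, s2)]
        gap = perspective(g, t, s) - (
            lam * perspective(g, t1, s1) + (1.0 - lam) * perspective(g, t2, s2)
        )
        worst = max(worst, gap)
    return worst


def squared_norm(z):
    """A convex test function g(z) = ||z||_2^2 (an illustrative choice)."""
    return sum(zi * zi for zi in z)


violation = max_convexity_violation(squared_norm)
```

For a convex g the reported violation should be non-positive up to floating-point rounding.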


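C Proof of Theorem 3


We shall only show the first assertion, since the second assertion can be shown in a similar way.
Define $f(t, s) = t\, g\left(\frac{s}{t}\right)$. Since $g(z)$ is a convex function of $z \in \mathscr{D}$, it follows from Theorem 2 that
$f(t, s)$ is a convex function of $t > 0$ and vector s such that $\frac{s}{t} \in \mathscr{D}$. Hence, there exist a column
vector $\alpha$ and a number $\beta$ such that

\[ f(t, s) \ge f(\mathbb{E}[Y], \mathbb{E}[Z]) + (s - \mathbb{E}[Z])\alpha + \beta\,(t - \mathbb{E}[Y]) \]

for $t > 0$ and vector s such that $\frac{s}{t} \in \mathscr{D}$. As a consequence of this result and the assumption that
$Y > 0$, $\frac{Z}{Y} \in \mathscr{D}$ and $\frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \in \mathscr{D}$, we have

\[ f(Y, Z) \ge f(\mathbb{E}[Y], \mathbb{E}[Z]) + (Z - \mathbb{E}[Z])\alpha + \beta\,(Y - \mathbb{E}[Y]). \]

Applying the definition of the function f to the above inequality yields

\[ Y g\left( \frac{Z}{Y} \right) \ge \mathbb{E}[Y]\, g\left( \frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \right) + (Z - \mathbb{E}[Z])\alpha + \beta\,(Y - \mathbb{E}[Y]). \]

Taking expectations on both sides leads to

\[ \mathbb{E}\left[ Y g\left( \frac{Z}{Y} \right) \right] \ge \mathbb{E}[Y]\, g\left( \frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \right) + \mathbb{E}[Z - \mathbb{E}[Z]]\,\alpha + \beta\, \mathbb{E}[Y - \mathbb{E}[Y]] = \mathbb{E}[Y]\, g\left( \frac{\mathbb{E}[Z]}{\mathbb{E}[Y]} \right). \]


D Proof of Theorem 4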


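To show the first assertion, we can use the first assertion of Theorem 3 to conclude that

\[ \mathbb{E}[N g(\overline{X}_N)] = \mathbb{E}\left[ N g\left( \frac{\sum_{i=1}^{N} X_i}{N} \right) \right] \ge \mathbb{E}[N]\, g\left( \frac{\mathbb{E}\left[ \sum_{i=1}^{N} X_i \right]}{\mathbb{E}[N]} \right). \]

By virtue of Wald’s first equation, we have

\[ \mathbb{E}\left[ \sum_{i=1}^{N} X_i \right] = \mathbb{E}[N]\mu. \]

Hence,

\[ \mathbb{E}[N g(\overline{X}_N)] \ge \mathbb{E}[N]\, g\left( \frac{1}{\mathbb{E}[N]}\, \mathbb{E}[N]\mu \right) = \mathbb{E}[N]\, g(\mu). \]

To show the second assertion, we can use the first assertion of Theorem 3 to conclude that

\[ \mathbb{E}[N g(\overline{V}_N)] = \mathbb{E}\left[ N g\left( \frac{\left( \sum_{i=1}^{N} X_i - N\mu \right)^2}{N} \right) \right] \ge \mathbb{E}[N]\, g\left( \frac{1}{\mathbb{E}[N]}\, \mathbb{E}\left[ \left( \sum_{i=1}^{N} X_i - N\mu \right)^2 \right] \right). \]

By virtue of Wald’s second equation, we have

\[ \mathbb{E}\left[ \left( \sum_{i=1}^{N} X_i - N\mu \right)^2 \right] = \mathbb{E}[N]\nu. \]

Hence,

\[ \mathbb{E}[N g(\overline{V}_N)] \ge \mathbb{E}[N]\, g\left( \frac{1}{\mathbb{E}[N]}\, \mathbb{E}[N]\nu \right) = \mathbb{E}[N]\, g(\nu). \]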


E Proof of Theorem 7

Define $\xi = \lambda - Z_1$. Let $F_\xi(\cdot)$ denote the cumulative distribution function of $\xi$. Note that


E 


Y^Zi-iX-Z^) I 


{Zi<\} 


<i =2 


= E 


E z .~? 


4c>o} 


a=2 


f E 

1 

.N 

I 

‘Fry 

II 
e 

_i 

lu>0 

_\i =2 / 


dF^u). (24) 
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By the definition of ^\, we have 


IE 


I>-f if 


= u 


K i =2 


= E 




22 Zl ~ u ) I £ = u 


,1=2 


(25) 


where 


A4 U = inf < n > 2 : Zj > 


u 


i =2 


Since the samples $Z_1, Z_2, \dots$ and $\lambda$ are independent, it follows that $\xi$ and $Z_2, Z_3, \dots$ are independent.
It follows that


E 


'Mu \ 

^ Zj - w | £ = u 


L \ 2=2 


= E 


y ^Zj-u 

L\i=2 J 


(26) 


for all u > 0. Define 


5Ut u = inf < n € N : ^ Z* > 


2=1 


for u > 0. Since Zi, Z 2 , • • • are i.i.d. samples of Z, it must be true that Yli=% Z i and Y2=l z i 
have the same distribution for all u > 0. Hence, 


E 


'Mu N 

22 Z i-U 

L\ i=2 ) 


= E 


E z ‘ 


_ \ 2=1 


for all u > 0. Combining (fMl) ([271) yields 


E 


(EZi-^-Zr)! 


E 


' u>0 


— U 


'2K„ 


(27) 


X> 


— u 


_ \ 2 = 1 


<iFg (u) 


(28) 


By Lorden’s inequality, we have


E 




X> 


— u 


_ \i=l 


< 


E[(Z +) 5 

E[Z] 


(29) 


for all u > 0. Making use of (081) and (129|) . we have 


E 


22 z*—( a— z{) j 11 


{Z 1<A} 


L \i =2 


< 


L 


E[(Z +) 2 


u> 0 ®[-^] 

E[(^ + ) 2 ] [ 

^\ z \ Ju >0 


dF, c (it) 

dF ? (u) 


e[(^ + ) 2 : 

E[Z] 

e[(^ + ) 2 ; 

E[Z] 

n(. z+ r. 

E[Z] 


Pr{£ > 0} 

Pr{A - Zi > 0} 
Pr{Z < A}. 


(30) 
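On the other hand,

\[ \mathbb{E}[R_\lambda\, \mathbb{I}_{\{Z_1 \ge \lambda\}}] = \mathbb{E}[(Z_1 - \lambda)^+] = \mathbb{E}[(Z - \lambda)^+]. \tag{31} \]

Combining (30) and (31) yields

\[ \mathbb{E}[R_\lambda] = \mathbb{E}[R_\lambda\, \mathbb{I}_{\{Z_1 < \lambda\}}] + \mathbb{E}[R_\lambda\, \mathbb{I}_{\{Z_1 \ge \lambda\}}] \le \frac{\mathbb{E}[(Z^+)^2]}{\mathbb{E}[Z]} \Pr\{Z < \lambda\} + \mathbb{E}[(Z - \lambda)^+]. \]

This completes the proof of the theorem.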


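F Proof of Theorem 8


We need some preliminary result.

Lemma 5 Let $X_1, X_2, \dots$ be i.i.d. samples of a positive random variable X such that $\mathbb{E}[X^2] < \infty$.
Define $S_n = \sum_{i=1}^{n} X_i$ for $n \in \mathbb{N}$. Let $N_1, N_2, \dots$ be an increasing sequence of positive integers.
Define $h = \sup_{\ell \ge 0} (N_{\ell+1} - N_\ell)$ with $N_0 = 0$. Define $\boldsymbol{N}_t = \inf\{n \in \mathscr{N} : S_n > t\}$ for $t > 0$, where
$\mathscr{N} = \{N_1, N_2, \dots\}$. Define $R_t = S_{\boldsymbol{N}_t} - t$. Then, $\mathbb{E}[R_t] \le (h - 1)\,\mathbb{E}[X] + \frac{\mathbb{E}[X^2]}{\mathbb{E}[X]}$ for any $t > 0$.


Proof. Let $t > 0$. Define $\boldsymbol{M}_t$ as the largest integer which is less than $\boldsymbol{N}_t$ and takes value in
the set $\{N_0, N_1, N_2, \dots\}$. Define

\[ \mathcal{N}_t = \inf\{n \in \mathbb{N} : S_n > t\}. \]

We claim that

\[ S_{h - 1 + \mathcal{N}_t} \ge S_{\boldsymbol{N}_t}. \]

To show this claim, note that $S_k$ is increasing with respect to $k \in \mathbb{N}$ as a consequence of $X > 0$.
Since

\[ S_{\boldsymbol{M}_t} \le t < S_{\mathcal{N}_t}, \]

we have

\[ \boldsymbol{M}_t \le \mathcal{N}_t - 1. \]

By the definition of h, we have

\[ S_{\boldsymbol{N}_t} \le S_{h + \boldsymbol{M}_t} \le S_{h - 1 + \mathcal{N}_t}. \]

The claim is thus true. It follows that

\[ \mathbb{E}[S_{\boldsymbol{N}_t}] \le \mathbb{E}[S_{h - 1 + \mathcal{N}_t}]. \]

Since $h - 1 + \mathcal{N}_t$ is a stopping time, by Wald’s first equation, we have

\[ \mathbb{E}[S_{h - 1 + \mathcal{N}_t}] = (\mathbb{E}[\mathcal{N}_t] + h - 1)\,\mathbb{E}[X]. \]

Therefore,

\begin{align*}
\mathbb{E}[S_{\boldsymbol{N}_t} - t] &\le \mathbb{E}[S_{h - 1 + \mathcal{N}_t} - t] \\
&= \mathbb{E}[S_{h - 1 + \mathcal{N}_t} - S_{\mathcal{N}_t}] + \mathbb{E}[S_{\mathcal{N}_t} - t] \\
&= \mathbb{E}[S_{h - 1 + \mathcal{N}_t}] - \mathbb{E}[S_{\mathcal{N}_t}] + \mathbb{E}[S_{\mathcal{N}_t} - t] \\
&= (\mathbb{E}[\mathcal{N}_t] + h - 1)\,\mathbb{E}[X] - \mathbb{E}[\mathcal{N}_t]\,\mathbb{E}[X] + \mathbb{E}[S_{\mathcal{N}_t} - t] \\
&= (h - 1)\,\mathbb{E}[X] + \mathbb{E}[S_{\mathcal{N}_t} - t].
\end{align*}

By Lorden’s inequality,

\[ \mathbb{E}[S_{\mathcal{N}_t} - t] \le \frac{\mathbb{E}[X^2]}{\mathbb{E}[X]}. \]

Hence,

\[ \mathbb{E}[R_t] = \mathbb{E}[S_{\boldsymbol{N}_t} - t] \le (h - 1)\,\mathbb{E}[X] + \frac{\mathbb{E}[X^2]}{\mathbb{E}[X]}. \]

This completes the proof of the lemma.


□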
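For integer-valued steps, the bound of Lemma 5 can be compared with an exact computation of the mean overshoot. The sketch below is an illustration under assumptions not made in the paper: the schedule is $N_\ell = \ell h$, so that checking only at multiples of h amounts to a random walk whose steps are sums of h consecutive samples, the step distribution is a made-up two-point law, and only integer thresholds $t \ge 0$ are examined. All function names are hypothetical.

```python
from fractions import Fraction


def convolve(p, q):
    """Distribution of the sum of two independent integer-valued variables."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0) + px * qy
    return out


def mean_overshoot(step_pmf, t_max):
    """r[t] = E[S_N - t] with N = inf{n : S_n > t}, for positive integer steps."""
    r = []
    for t in range(t_max + 1):
        # Either the first step already exceeds t, or the walk restarts from level t - x.
        r.append(sum(p * ((x - t) if x > t else r[t - x])
                     for x, p in step_pmf.items()))
    return r


def lemma5_sides(pmf, h, t_max):
    """Return (max over t of E[R_t], right-hand side of Lemma 5) for the schedule N_l = l*h."""
    block = dict(pmf)
    for _ in range(h - 1):
        block = convolve(block, pmf)      # law of a block of h consecutive samples
    m1 = sum(x * p for x, p in pmf.items())
    m2 = sum(x * x * p for x, p in pmf.items())
    bound = (h - 1) * m1 + m2 / m1
    return max(mean_overshoot(block, t_max)), bound


pmf = {1: Fraction(1, 2), 3: Fraction(1, 2)}   # hypothetical step distribution
lhs, rhs = lemma5_sides(pmf, 2, 40)
```

Exact rational arithmetic is used so that the comparison between the worst overshoot and the bound is not affected by rounding.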


We are now in a position to prove the theorem. Define $\xi = \lambda - Y$. Let $F_\xi(\cdot)$ denote the
cumulative distribution function of $\xi$. Note that



' / \ 


' / \ 

E 

E Z *-( X ~ Y ) v<ai 

= E 

E z <-( \ %>»> 


_ \i=N !+1 / 


yj=AT 1 + l J 


E 


' u> 0 


E z ‘-(\ if 

l i=jVi+l 


= U 


dF^(u). (32) 


By the definition of -M \, we have 



' / .//a \ 


’ / Mu \ 

E 

E z <-f if=» 

= E 

E u 1 c = u 


y^ATi+i / 


yi=7v 1+ i J 


where 


(33) 


A4 U = inf < n € JV : n > IV 2 , Zj > 

[ t=JVi+l 

Since the samples $Z_1, Z_2, \dots$ and $\lambda$ are independent, it follows that $\xi$ and $Z_2, Z_3, \dots$ are independent.
It follows that



" / Mu \ 


'( M u y 

E 

E Z i~ U ) \£ = u 

= E 

E 


_ \i=JVi+l J 


_ \i=JVi+l J _ 


(34) 
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for all u > 0. Define 


u 


for u > 0, where 


9Jt u = inf < n € 91 : Z % > 

[ i=i 

Vl = {N t -N 1 :l = 2,3, 


Since Zi, Z 2 , • • • are i.i.d. samples of Z, it must be true that X)i=jVi+i Zj and ^i=i have the 
same distribution for all u > 0. Hence, 




(35) 



’ / M u \" 


~/m u \- 

E 

E Z.-u 

= E 

E z <-“ 


_ \i=JVi+l / _ 


A »=i /- 


for all u > 0. Combining (f32l) (flail yields 



( \ 

r 

■ /m u \" 

E 

< 

V 

N 

1=1 

A 

1 

1 

w 

= E 

E z <-“ 


yi=iVi+l J 

J u> 0 

Ai=l J. 


dFz( 


u 


By Lemma O we have 


E 




E z - 


— u 


_ \i=l 


< (K - 1)E[Z] + 


E[Z 2 ] 

E[Z] 


(36) 


(37) 


for all u > 0. Making use of (f36l) and ([37]) . we have 


E 


J(\ 


E ^ I {y<A} 

. i=JVi+l 


< 


' u>0 


(K - 1)E[Z] + 
= ( (K — 1)E[Z] + 


E[(Z + f 
E[Z] 

E[(Z+) 2 ] 


E [Z] 


' u> 0 


dF^(u) 

dFf(u) 


= ((iL-l)E[Z] + E[ ^ ] ) Pr{g > 0} 

= ( (K - 1)E[Z] + E ^f ^ ^ ) Pr{A - Y > 0} 


E[Z] 

= ( (K - 1)E[Z] + E[ ^ )2] ) Pr{E < A}. (38) 


On the other hand, 


Combining (l38l) and (l39l) yields 
E[.Ra] = I{y<A}] + E[7 ?a I{f>a}] < ( {K — 1)E[Z] + 

This completes the proof of the theorem. 


E[R X I {y >A } ] = E[(y - A)+], 

E[Z 2 ] 


(39) 


E[Z] 


Pr{y < A} +E[(y - A)+]. 
G Proof of Theorem 9


We shall first show $\mathbb{E}[N] \ge \min \mathscr{A}$ under the assumption that $\mathscr{A}$ is not empty. If $\mathbb{E}[N] = \infty$,
then $\mathbb{E}[N] \ge \min \mathscr{A}$ trivially holds. If $\mathbb{E}[N] < \infty$, then $\Pr\{N < \infty\} = 1$ and it follows that $S_N$
is well-defined and

\[ \Pr\{(N, S_N) \in \mathscr{R}\} = 1. \]

According to Theorem 1, we have

\[ (\mathbb{E}[N], \mathbb{E}[S_N]) \in \mathscr{R}. \]

Since $\mathbb{E}[N] < \infty$, it follows from Wald’s equation that $\mathbb{E}[S_N] = \mathbb{E}[N]\mu$. Hence,

\[ (\mathbb{E}[N], \mathbb{E}[N]\mu) \in \mathscr{R}, \]

which immediately implies $\mathbb{E}[N] \ge \min \mathscr{A}$.

It remains to show that $\mathbb{E}[N] = \infty$ under the assumption that $\mathscr{A}$ is empty. We argue by
contradiction. Suppose that $\mathbb{E}[N] < \infty$. Then $\Pr\{N < \infty\} = 1$ and it follows that

\[ \Pr\{(N, S_N) \in \mathscr{R}\} = 1. \]

According to Theorem 1, we have

\[ (\mathbb{E}[N], \mathbb{E}[S_N]) \in \mathscr{R}. \]

Since $\mathbb{E}[N] < \infty$, it follows from Wald’s equation that $\mathbb{E}[S_N] = \mathbb{E}[N]\mu$. Hence,

\[ (\mathbb{E}[N], \mathbb{E}[N]\mu) \in \mathscr{R}, \]

which immediately implies that $\mathscr{A}$ is not empty. This is a contradiction. Therefore, it must be
true that $\mathbb{E}[N] = \infty$ if $\mathscr{A}$ is empty. The proof of the theorem is thus completed.

H Proof of Theorem 10

We need some preliminary results.

Lemma 6 There exist A, B and a positive number C such that $As + Bn = C$ is a supporting
hyperplane of $\mathscr{R}$ which passes through $(m, m\mu)$.

Proof. Since $\mathscr{R}$ is a convex set containing $(0, 0_d)$ and $(m, m\mu)$, it must be true that $\mathscr{R}$ contains
$(\rho m, \rho m\mu)$ for all $\rho \in (0, 1)$. Since $\mathscr{R}$ is a convex set, there exist A, B and $C \ge 0$ such that
$As + Bn = C$ is a supporting hyperplane of $\mathscr{R}$ such that $m(A\mu + B) = C$. We claim that $C > 0$.
To prove this claim, we argue by contradiction. Suppose $C = 0$. Then, the supporting
hyperplane must contain $(0, 0_d)$. It follows that the supporting hyperplane contains $(\rho m, \rho m\mu)$
for all $\rho \in (0, 1)$. Recall the assumption that there exists a unique positive number t such that
$(t, t\mu) \in \partial\mathscr{R}$. Hence, there exists $\rho \in (0, 1)$ such that $(\rho m, \rho m\mu) \notin \partial\mathscr{R}$ and $(\rho m, \rho m\mu) \in \mathscr{R}$.
This implies that the supporting hyperplane contains some interior point of $\mathscr{R}$. This contradicts
the definition of a supporting hyperplane. Thus, we have shown the claim that $C > 0$. This
completes the proof of the lemma.

□

Lemma 7 There exist $\delta > 0$ and $T > 0$ such that $\{N > n\} \subseteq \{\|\overline{X}_n - \mu\|_2 \ge \delta\}$ for $n \in \mathscr{N}$
greater than T.

Proof. According to Lemma[ 6 l there exist A, B and a positive number C such that As + Bn = C 
is a supporting hyperplane of which passes through {m,mp). Define 

T = inf{n £ jV : As + Bn > (7}. 

Since (7 > 0 and (0, 0^) is contained by the convex set it follows that As + Bt < C for all 
{t, s ) € 2?. This implies that N < T. Since (7 > 0, we have Ap, + B = ^ > 0. Since Ad + B is a 
continuous function of 6 , there exist <5 > 0 and T > 0 such that 

A6 + B>j 

for all 6 £ {6 £ : ||0 — p \\2 < 5}. Hence, 

{AT > n} C {T > n} C {n{AX n + B) < C} C j AX n + B < ^ j C {\\X n - p \\ 2 > 5} 
for n € JY greater than T. This completes the proof of the lemma. 

□ 

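Lemma 8 Let $Y_1, Y_2, \dots$ be i.i.d. samples of a scalar random variable Y which has mean $a = \mathbb{E}[Y]$,
variance $\nu = \mathbb{E}[|Y - a|^2]$ and finite $\mathbb{E}[|Y|^3]$. Let $\overline{Y}_n = \frac{\sum_{i=1}^{n} Y_i}{n}$. Then,

\[ \Pr\{|\overline{Y}_n - a| \ge \gamma\} \le \exp\left( -\frac{n\gamma^2}{2\nu} \right) + \frac{2\mathscr{C}W}{n^2\gamma^3} \quad \text{for any } \gamma > 0, \tag{40} \]

where $W = \mathbb{E}[|Y - a|^3]$ and $\mathscr{C} > 0$ is an absolute constant.

Proof. Note that

\[ \Pr\{\overline{Y}_n \le z\} = \Pr\left\{ \frac{\sqrt{n}\,(\overline{Y}_n - a)}{\sqrt{\nu}} \le \frac{\sqrt{n}\,(z - a)}{\sqrt{\nu}} \right\} \]

and

\[ \Pr\{\overline{Y}_n \ge z\} = \Pr\left\{ \frac{\sqrt{n}\,(a - \overline{Y}_n)}{\sqrt{\nu}} \le \frac{\sqrt{n}\,(a - z)}{\sqrt{\nu}} \right\}. \]

Since $\mathbb{E}[|Y|^3]$ is finite, W must be finite. By the non-uniform version of the Berry-Esseen inequality
[3, Page 44, Theorem 6.4],

\[ \left| \Pr\{\overline{Y}_n \le z\} - \Phi\left( \frac{\sqrt{n}\,(z - a)}{\sqrt{\nu}} \right) \right| \le \frac{\mathscr{C}W}{\sqrt{n\nu^3} + n^2 |z - a|^3} < \frac{\mathscr{C}W}{n^2 |z - a|^3} \tag{41} \]

for $z \in \mathbb{R}$ with $z \neq a$. Making use of (41) and the fact that $\Phi(x) \le \frac{1}{2} \exp\left(-\frac{x^2}{2}\right)$ for any $x \le 0$, we have

\[ \Pr\{\overline{Y}_n \le z\} \le \frac{1}{2} \exp\left( -\frac{n |z - a|^2}{2\nu} \right) + \frac{\mathscr{C}W}{n^2 |z - a|^3} \tag{42} \]

for z less than a. In a similar manner, we can show

\[ \Pr\{\overline{Y}_n \ge z\} \le \frac{1}{2} \exp\left( -\frac{n |z - a|^2}{2\nu} \right) + \frac{\mathscr{C}W}{n^2 |z - a|^3} \tag{43} \]

for z greater than a. Finally, combining (42) and (43) yields (40). This completes the proof of the lemma.


□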


We are now in a position to prove the theorem. For i = 1, • • • , d, let x^ be the z-th component 
of X, i.e., X = [*i, ■ ■ ■ , Xd]. Let m = E[®i] for i = 1, • • • , d. For i = 1, • • • , d, let xij, j = 1,2,--- 
be i.i.d. samples of Xi such that Xi = [xn, • • • , Xi d ]- Define Xi n = L ^” =] x t j for i = 1, ■ • ■ , d. 
Then, X n = [xi n , ■ ■ ■ ,x dn ] and 

d 

||-^"n A 4 112 — /* ' |®m Ah| ■ 
i =1 


By Lemma [3 we have that there exist 5 > 0 and T > 0 such that {N > n) C {||X n — /x[ [2 > 
for n € YV greater than T. Hence, 


Pr{N > n} 


< 


< 


Pr{||X n -//|| 2 >6} 
Pr{||X n -HH ><5 2 } 

Pr j ^ \x in - ml 2 > 6 2 


E Pr 

i=z 1 

d 

Err 


, 2 o‘ \ 

•Kin | ^ ^ f 

•Kin 
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for n € jV greater than T. By virtue of Lemma [U we have that there exist constants c\ > 0 and 
C 2 > 0 such that 


Pr \ \x in - Hi\ > — > < exp(-nci) + 


\fd 


C2 


n“ 


for i = 1, • • • , d. Hence, 


C2 


n 2 J 


Pr{iV > n} < d exp(—nci) + 
for n € JY greater than T. Let r > 1 be an integer such that IV T _i < T < N r . Note that 

TV- 7 — 1 oo 


(44) 


E [N] = Y Fl '{ N > n l + Pl i N > n i 


n =0 


n=N T 


< n t + Y Pr {-W > n } 

n=N r 

oo -^+i — 1 

= N T + E E 

i=r n=Ng 


< 


Nr + 'Y J ( N e+i - N e ) Pr{AT > Ni}. 


(45) 


1=7 


Combining (fHl) and (USD yields 

CO 

E [N] < N T + Y( N e+i~ N e) 


£=r 

CO 


C 2 


exp(-ciA^) + -^2 


Nr + YXN i+1 - Nl) exp(-ciA^) + c 2 Y 


£=t 


i=T 


Ni +1 — Ni 
N 2 ' 


(46) 


If limsup^^ (Ni + \ — Ni) < oo, then there exists a constant B > 0 such that N(_ + \ — Ni < B for 
all i > r. It follows that 

~ ~ 1 


E[AT] < IV r + H^exp(-ciiV,) + c 2 H^^2 


l=T 

OO 


1=7 

CO 


< N r + B Y^ exp(-cin) + c 2 £> Y, ^2 


71 = 1 


n=l 


= iv r + 




c 2 £>7r 2 
H--— < 00 . 


exp(ci) — 1 6 

It remains to show the boundedness of E [N] in the case of liminf^oo > 1. Making use 
of (1461) and the assumption that there exist numbers A > 0 and K such that Ni + \ < A Ni + K for 
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all l > 0, we have 


E[N] < N T + ^2(XN e + K-N e )exp{-c 1 N e ) + c 2 Y^ 


£=t 


i=T 


A N, + K-N t 


LXJ LXJ 1 LAJ LXJ 1 

jVr + exp(-ciiVf) + c 2 V y^^ + (A-l)^iV, exp(-ciA^) + (A - 1) ^ 

£=t ^ £=r 


£=r 

oo 


£=j 


Ni 


< iV r + AT ^2 exp(— c\n) + c 2 K ^ ^ + IA - 1| ^nexp(— c\n) + (A - 1) 'Y — 


= N t + 

Note that 

because 


72—1 

K 


72=1 


72=1 


jr 2 00 oo 

+ C2 + IA — 1| Y nexp(-cin) + (A - 1) . 


72=1 


exp(ci) — 1 6 

n exp(— c\n) < 

72=1 

(n + 1) exp(—ci(n + 1)) 


t=T 


oo 


lim 


n->c« nexp(—cin) 

As a consequence of liminf^oo > 1, we have 


= exp(—ci) < 1. 


OO 

Y- 

/v 




< oo. 


Therefore, it must be true that E [N] < oo in the case of liminf£_ s . 00 > 1. The proof of the 
theorem is thus completed. 


I Proof of Theorem 11

Since assumptions (I)-(VI) are fulfilled, it follows from Theorem 10 that $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$,
which implies $\Pr\{M < \infty\} = 1$ and $\Pr\{N < \infty\} = 1$. Hence, N and M are well-defined random
variables. Define

\[ \Delta = S_N - S_M - (N - M)\mu. \]

Our proof of the theorem relies on some properties of $\Delta$ as stated by the following lemma.

Lemma 9

\[ \mathbb{E}[\Delta^+] \le \mathbb{E}[N - N_0]\, \mathbb{E}[(X - \mu)^+], \tag{47} \]

\[ \mathbb{E}[\Delta^-] \le \mathbb{E}[N - N_0]\, \mathbb{E}[(X - \mu)^-], \tag{48} \]

\[ \mathbb{E}[|\Delta|] \le \mathbb{E}[N - N_0]\, \mathbb{E}[|X - \mu|]. \tag{49} \]
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Proof. Define 

Ag = Sn ( — 5jv € _j — (Ng — Ng-\)n 

for £ = 1,2, ■ • ■ . Let r denote the stopping index such that iV T = N. Note that 

OO 

E[zA+] = ^E[A+I {t=£} ] 

£=1 

OO 

= e e [^ +i {t=h] 

£=1 

OO 

= E[/A+I {r= 1 } ]+^E[Z\+I {r= , } ] 

i=2 

OO 

< mi i{r=i } ] + E mi hT>t-i}] 

i=2 

oo 

< EK] + ^E[Z\+I {t> ,_ 1} ]. 

g=2 

Observing that Ag depends only on {X n : Ng- 1 + 1 < n < I\fy} and that the event {t > £ — 1} 
depends only on {X n : 1 < n < Ng_{\, we have that 

mt I{r>^i}] = E[Z\+]E[I {t> ,_ 1} ] = E [Aj] Pr{r > £ - 1} 

for £ > 1. It follows that 

OO OO 

m + ] < mi] + E E ^ + Will = mi) + E E ^ + ] Pi- { r > £ ~ 1 }- 

g = 2 £=2 

As a consequence of the identical independence of X\, X 2 , ■ ■ ■, we have 

E [A+] < (N e - Ng-g)E[(X - fi)+], 1 = 1,2,---. 


Hence, 


E[zl + ] < 


Ni - n 0 + x>* +1 - N t) m-t > 


g=1 


e [{x-»y 


= E[N-No]M[(X-ii)+]. 


This proves (1471) . By similar arguments we can show the inequalities (1481) and (1491) regarding 
E[Z\ _ ] and E[|zA|], respectively. 

□ 


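Lemma 10

\[ \mathbb{E}[M]\alpha + \zeta \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta + \eta, \]

where

\[ \alpha = \mu - \lambda\, \mathbb{E}[(X - \mu)^+], \qquad \zeta = (N_0 - K)\, \mathbb{E}[(X - \mu)^+], \]

\[ \beta = \mu + \lambda\, \mathbb{E}[(X - \mu)^-], \qquad \eta = (K - N_0)\, \mathbb{E}[(X - \mu)^-]. \]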

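Proof. By the assumption that $\mathbb{E}[|X|^3]$ is bounded, we have that $\mathbb{E}[X^+]$ and $\mathbb{E}[X^-]$ are bounded.
By Theorem 10, N is a stopping time such that $\mathbb{E}[N] < \infty$. Hence, it follows from Wald’s first
equation that

\[ \mathbb{E}[(S_N)^+] \le \mathbb{E}\left[ \sum_{i=1}^{N} (X_i)^+ \right] = \mathbb{E}[N]\, \mathbb{E}[X^+], \qquad \mathbb{E}[(S_N)^-] \le \mathbb{E}\left[ \sum_{i=1}^{N} (X_i)^- \right] = \mathbb{E}[N]\, \mathbb{E}[X^-]. \]

Thus,

\[ \mathbb{E}[|S_N|] \le 2 \max\{\mathbb{E}[(S_N)^+], \mathbb{E}[(S_N)^-]\} < \infty. \tag{50} \]

By the definition of $\Delta$, we have

\[ S_M = S_N - \Delta - (N - M)\mu. \tag{51} \]

Hence,

\[ \mathbb{E}[|S_M|] \le \mathbb{E}[|S_N|] + \mathbb{E}[|\Delta|] + \mathbb{E}[N - M]\, |\mu|. \tag{52} \]

From (49) of Lemma 9, we have

\[ \mathbb{E}[|\Delta|] \le \mathbb{E}[N - N_0]\, \mathbb{E}[|X - \mu|]. \tag{53} \]

Since $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$, we have

\[ \mathbb{E}[N - M] < \infty. \tag{54} \]

Combining (50)-(54) leads to the boundedness of $\mathbb{E}[|S_M|]$. This establishes the existence of
$\mathbb{E}[S_M]$. Taking expectations on both sides of (51) yields

\begin{align*}
\mathbb{E}[S_M] &= \mathbb{E}[S_N - \Delta - (N - M)\mu] \\
&= \mathbb{E}[S_N] - \mathbb{E}[\Delta] - \mathbb{E}[N]\mu + \mathbb{E}[M]\mu \\
&= \mathbb{E}[N]\mu - \mathbb{E}[\Delta] - \mathbb{E}[N]\mu + \mathbb{E}[M]\mu \tag{55} \\
&= \mathbb{E}[M]\mu - \mathbb{E}[\Delta] \\
&= \mathbb{E}[M]\mu - \mathbb{E}[\Delta^+] + \mathbb{E}[\Delta^-], \tag{56}
\end{align*}

where we have used Wald’s equation $\mathbb{E}[S_N] = \mathbb{E}[N]\mu$ in (55). As a consequence of (56), we have

\[ \mathbb{E}[M]\mu - \mathbb{E}[\Delta^+] \le \mathbb{E}[S_M] \le \mathbb{E}[M]\mu + \mathbb{E}[\Delta^-]. \tag{57} \]

In view of $N \le \lambda M + K$, we have

\[ \mathbb{E}[N - N_0] \le \lambda\, \mathbb{E}[M] + K - N_0. \tag{58} \]

Making use of (48), (58) and the second inequality of (57), we have

\begin{align*}
\mathbb{E}[S_M] &\le \mathbb{E}[M]\mu + \mathbb{E}[\Delta^-] \\
&\le \mathbb{E}[M]\mu + \mathbb{E}[N - N_0]\, \mathbb{E}[(X - \mu)^-] \\
&\le \mathbb{E}[M]\mu + (\lambda\, \mathbb{E}[M] + K - N_0)\, \mathbb{E}[(X - \mu)^-] \\
&= \mathbb{E}[M] \left\{ \mu + \lambda\, \mathbb{E}[(X - \mu)^-] \right\} + (K - N_0)\, \mathbb{E}[(X - \mu)^-] \\
&= \mathbb{E}[M]\beta + \eta.
\end{align*}

Making use of (47), (58) and the first inequality of (57), we have

\begin{align*}
\mathbb{E}[S_M] &\ge \mathbb{E}[M]\mu - \mathbb{E}[\Delta^+] \\
&\ge \mathbb{E}[M]\mu - \mathbb{E}[N - N_0]\, \mathbb{E}[(X - \mu)^+] \\
&\ge \mathbb{E}[M]\mu - (\lambda\, \mathbb{E}[M] + K - N_0)\, \mathbb{E}[(X - \mu)^+] \\
&= \mathbb{E}[M] \left\{ \mu - \lambda\, \mathbb{E}[(X - \mu)^+] \right\} - (K - N_0)\, \mathbb{E}[(X - \mu)^+] \\
&= \mathbb{E}[M]\alpha + \zeta.
\end{align*}

This completes the proof of the lemma.

□

We are now in a position to prove the theorem. By the definition of M, we have

\[ \Pr\{(M, S_M) \in \mathscr{R}\} = 1. \]

Since $\mathbb{E}[M] < \infty$ and $\mathbb{E}[S_M]$ exists, it follows from Theorem 1 that

\[ (\mathbb{E}[M], \mathbb{E}[S_M]) \in \mathscr{R}. \]

The conclusion of the theorem immediately follows from this fact and Lemma 10.


□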


J Proof of Theorem 12

We need some preliminary results.

Lemma 11 Let X be a random variable with mean $\mu$ such that $\Pr\{a \le X \le b\} = 1$. If $g(x)$ is a
convex function of $x \in [a, b]$, then

\[ \mathbb{E}[g(X)] \le \frac{1}{b - a} \left[ (b - \mu)\, g(a) + (\mu - a)\, g(b) \right]. \tag{59} \]

In particular,

\[ \mathbb{E}[(X - \mu)^+] \le \frac{(\mu - a)(b - \mu)}{b - a}, \tag{60} \]

\[ \mathbb{E}[(X - \mu)^-] \le \frac{(\mu - a)(b - \mu)}{b - a}. \tag{61} \]

Proof. To show (59), note that, as a consequence of the convexity of the function g,

\[ g(x) \le \frac{g(b) - g(a)}{b - a}\,(x - a) + g(a), \qquad x \in [a, b]. \]

By the assumption that $\Pr\{a \le X \le b\} = 1$, we have

\[ g(X) \le \frac{g(b) - g(a)}{b - a}\,(X - a) + g(a) \]

almost surely. Taking expectation on both sides of the above inequality yields

\[ \mathbb{E}[g(X)] \le \frac{g(b) - g(a)}{b - a}\,[\mu - a] + g(a) = \frac{1}{b - a} \left[ (b - \mu)\, g(a) + (\mu - a)\, g(b) \right]. \]

This establishes (59). Applying (59) to the convex functions $g(x) = \max\{x - \mu, 0\}$ and $g(x) = \max\{\mu - x, 0\}$
yields (60) and (61), respectively.

□
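Inequalities (59)-(61) are easy to check exactly on a small discrete distribution. The sketch below is only an illustration: the probability mass function, the interval $[a, b] = [0, 4]$ and the helper names are made up for the example, and exact rational arithmetic is used so that both sides can be compared without rounding.

```python
from fractions import Fraction


def expect(pmf, g):
    """E[g(X)] for a discrete distribution given as {value: probability}."""
    return sum(p * g(x) for x, p in pmf.items())


def chord_bound(pmf, g, a, b):
    """Right-hand side of (59): [(b - mu) g(a) + (mu - a) g(b)] / (b - a)."""
    mu = expect(pmf, lambda x: x)
    return ((b - mu) * g(a) + (mu - a) * g(b)) / (b - a)


# Hypothetical distribution supported on [a, b] = [0, 4] with mean 5/4.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 4), 4: Fraction(1, 4)}
a, b = 0, 4
mu = expect(pmf, lambda x: x)


def positive_part(x):
    return max(x - mu, 0)


def negative_part(x):
    return max(mu - x, 0)


lhs_plus = expect(pmf, positive_part)         # E[(X - mu)^+]
lhs_minus = expect(pmf, negative_part)        # E[(X - mu)^-]
rhs = chord_bound(pmf, positive_part, a, b)   # (mu - a)(b - mu) / (b - a)
```

In this example both one-sided expectations equal 11/16, while the common bound in (60) and (61) equals 55/64.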


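Lemma 12 Suppose that $\Pr\{a \le X \le b\} = 1$, where $a, b \in \mathbb{R}^d$, and that the assumptions (I)-(V)
are fulfilled. Define

\[ v = \frac{(\mu - a)(b - \mu)}{b - a}, \qquad \alpha = \mu - \lambda v, \qquad \beta = \mu + \lambda v, \qquad \zeta = (N_0 - K)\, v, \qquad \eta = -\zeta \]

and

\[ \alpha' = b + \lambda(\mu - b), \qquad \beta' = a + \lambda(\mu - a), \qquad \zeta' = K(\mu - b), \qquad \eta' = K(\mu - a). \]

Then,

\[ \mathbb{E}[M]\alpha + \zeta \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta + \eta, \tag{62} \]

\[ \mathbb{E}[M]\alpha' + \zeta' \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta' + \eta'. \tag{63} \]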

Proof. Since $\Pr\{a \le X \le b\} = 1$ and assumptions (I) - (V) are fulfilled, it follows from Theorem 10 that $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$. Since $\Pr\{a \le X \le b\} = 1$, it follows from Lemma 11 that

\[ \mathbb{E}[(X-\mu)^+] \le \nu, \tag{64} \]

\[ \mathbb{E}[(X-\mu)^-] \le \nu. \tag{65} \]

Making use of (1641) . (1651) and Lemma El we have 

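\[ \mathbb{E}[\Delta^+] \le \mathbb{E}[N - N_0]\,\mathbb{E}[(X-\mu)^+] \le \mathbb{E}[N - N_0]\,\nu, \tag{66} \]

\[ \mathbb{E}[\Delta^-] \le \mathbb{E}[N - N_0]\,\mathbb{E}[(X-\mu)^-] \le \mathbb{E}[N - N_0]\,\nu. \tag{67} \]

Making use of (57), (55), (57) and the fact that $N \le \lambda M + K$, we have

\[
\begin{aligned}
\mathbb{E}[S_M] &\le \mathbb{E}[M]\mu + \mathbb{E}[\Delta^-] \\
&\le \mathbb{E}[M]\mu + \mathbb{E}[N - N_0]\,\nu \\
&\le \mathbb{E}[M]\mu + \mathbb{E}[\lambda M + K - N_0]\,\nu \\
&= \mathbb{E}[M]\beta + \eta,
\end{aligned}
\]

and

\[
\begin{aligned}
\mathbb{E}[S_M] &\ge \mathbb{E}[M]\mu - \mathbb{E}[\Delta^+] \\
&\ge \mathbb{E}[M]\mu - \mathbb{E}[N - N_0]\,\nu \\
&\ge \mathbb{E}[M]\mu - \mathbb{E}[\lambda M + K - N_0]\,\nu \\
&= \mathbb{E}[M]\alpha + \zeta.
\end{aligned}
\]

This proves (62). It remains to show (63). Recall that

\[ \Delta = S_N - S_M - (N - M)\mu = \sum_{i=M+1}^{N} (X_i - \mu). \]

Since $\Pr\{a \le X \le b\} = 1$, it follows that

\[ (X - \mu)^+ \le b - \mu, \qquad (X - \mu)^- \le \mu - a \]

almost surely. Hence,

\[ \mathbb{E}[\Delta^+] \le \mathbb{E}\left[\sum_{i=M+1}^{N} (X_i - \mu)^+\right] \le \mathbb{E}[N - M]\,(b - \mu), \tag{68} \]

\[ \mathbb{E}[\Delta^-] \le \mathbb{E}\left[\sum_{i=M+1}^{N} (X_i - \mu)^-\right] \le \mathbb{E}[N - M]\,(\mu - a). \tag{69} \]

Making use of (57), (55), (59) and the fact that $N \le \lambda M + K$, we have

\[
\begin{aligned}
\mathbb{E}[S_M] &\le \mathbb{E}[M]\mu + \mathbb{E}[\Delta^-] \\
&\le \mathbb{E}[M]\mu + \mathbb{E}[N - M]\,(\mu - a) \\
&\le \mathbb{E}[M]\mu + \{(\lambda - 1)\mathbb{E}[M] + K\}(\mu - a) \\
&= \mathbb{E}[M][\lambda(\mu - a) + a] + K(\mu - a) \\
&= \mathbb{E}[M]\beta' + \eta',
\end{aligned}
\]

and

\[
\begin{aligned}
\mathbb{E}[S_M] &\ge \mathbb{E}[M]\mu - \mathbb{E}[\Delta^+] \\
&\ge \mathbb{E}[M]\mu - \mathbb{E}[N - M]\,(b - \mu) \\
&\ge \mathbb{E}[M]\mu - \{(\lambda - 1)\mathbb{E}[M] + K\}(b - \mu) \\
&= \mathbb{E}[M][b - \lambda(b - \mu)] + K(\mu - b) \\
&= \mathbb{E}[M]\alpha' + \zeta'.
\end{aligned}
\]

This proves (63). The proof of the lemma is thus completed.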


□ 


We are now in a position to prove the theorem. By the definition of $M$, we have $\Pr\{(M, S_M) \in \mathscr{B}\} = 1$. Since $\mathbb{E}[M] < \infty$ and $\mathbb{E}[S_M]$ exists, it follows from Theorem 1 that $(\mathbb{E}[M], \mathbb{E}[S_M]) \in \mathscr{B}$. The conclusion of the theorem immediately follows from this fact and Lemma 12.


K Proof of Theorem 13

We need some preliminary results. 

Lemma 13 $\mathbb{E}[N] < \infty$.

Proof. Since $\mu$ is an interior point of the convex set $D$, there exists a number $\delta > 0$ such that $\{\theta \in \mathbb{R}^d : \|\theta - \mu\|_2 \le \delta\} \subseteq D$. Since $g$ is a concave function on $D$, it must be a continuous function on $D$. By the boundedness theorem, there exists a positive number $T$ such that $g(\theta) < T$ for any $\theta$ contained in the set $\{\theta \in \mathbb{R}^d : \|\theta - \mu\|_2 \le \delta\}$. This implies that

\[ \{N > n\} \subseteq \{\|\overline{X}_n - \mu\|_2 > \delta\} \]

for any $n \in \mathscr{N}$ greater than $T$. Let $\tau \ge 1$ be an integer such that $N_{\tau-1} \le T < N_\tau$. Using the same technique as that for proving (46), we can show that

E[iV] <N t + ^(iV £+1 - N e ) exp(-ciiV £ ) + c 2 ^ ^ (70) 

l=T i=T ^ 

where $c_1$ and $c_2$ are some positive constants. As a consequence of $N_{\ell+1} - N_\ell \le K$, $\ell = 0, 1, 2, \cdots$, the right-hand side of (70) can be readily shown to be bounded.

□ 


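Lemma 14

\[ \mathbb{E}[M] \le g\!\left(\frac{\mathbb{E}[S_M]}{\mathbb{E}[M]}\right). \]

Proof. From Lemma 13 we have $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$; it follows that $S_M$ is well-defined and that $\mathbb{E}[S_M]$ exists. By the definition of $M$, we have

\[ M \le g\!\left(\frac{S_M}{M}\right) \]

almost surely. Multiplying both sides of the above inequality by $M$ yields

\[ M^2 \le M g\!\left(\frac{S_M}{M}\right) \]

almost surely. Taking expectation on both sides of the above inequality and using Jensen's inequality yields

\[ \mathbb{E}^2[M] \le \mathbb{E}[M^2] \le \mathbb{E}\left[M g\!\left(\frac{S_M}{M}\right)\right]. \]

Since $M$ is a positive random variable and $g$ is a concave function, it follows from Theorem 3 that

\[ \mathbb{E}\left[M g\!\left(\frac{S_M}{M}\right)\right] \le \mathbb{E}[M]\, g\!\left(\frac{\mathbb{E}[S_M]}{\mathbb{E}[M]}\right). \]

Hence,

\[ \mathbb{E}^2[M] \le \mathbb{E}[M]\, g\!\left(\frac{\mathbb{E}[S_M]}{\mathbb{E}[M]}\right). \]

Dividing both sides of the above inequality by $\mathbb{E}[M]$ yields

\[ \mathbb{E}[M] \le g\!\left(\frac{\mathbb{E}[S_M]}{\mathbb{E}[M]}\right). \]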

□ 
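The key step in the proof of Lemma 14 is the inequality $\mathbb{E}[M g(S_M/M)] \le \mathbb{E}[M]\, g(\mathbb{E}[S_M]/\mathbb{E}[M])$ for concave $g$ and positive $M$, which the proof attributes to Theorem 3. The sketch below checks this inequality on a hypothetical joint distribution of a pair $(M, S)$; the distribution, the concave function $g(v) = 4v - v^2$, and the helper name `weighted_mean` are all assumptions made for illustration.

```python
from fractions import Fraction

def g(v):
    """A concave test function (hypothetical choice): g(v) = 4v - v^2."""
    return 4 * v - v * v

def weighted_mean(values, probs):
    """Expectation of a discrete random variable."""
    return sum(p * x for x, p in zip(values, probs))

# Hypothetical joint distribution of (M, S): three outcomes with given probabilities.
outcomes = [(Fraction(1), Fraction(1)), (Fraction(2), Fraction(1)), (Fraction(4), Fraction(6))]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

e_m = weighted_mean([m for m, _ in outcomes], probs)               # E[M]
e_s = weighted_mean([s for _, s in outcomes], probs)               # E[S]
lhs = weighted_mean([m * g(s / m) for m, s in outcomes], probs)    # E[M g(S/M)]
rhs = e_m * g(e_s / e_m)                                           # E[M] g(E[S]/E[M])
print(lhs, rhs)
```

The inequality holds here because $(t, s) \mapsto t\, g(s/t)$ is concave on $t > 0$ whenever $g$ is concave, so the ordinary Jensen inequality applies to the pair $(M, S)$.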


We are now in a position to prove the theorem. By the same argument as that used in the proof of Theorem 11, we can show that

\[ \mathbb{E}[M]\mu - \mathbb{E}[N - N_0]\,\mathbb{E}[(X-\mu)^+] \le \mathbb{E}[S_M] \le \mathbb{E}[M]\mu + \mathbb{E}[N - N_0]\,\mathbb{E}[(X-\mu)^-]. \]

As a consequence of $N_{\ell+1} - N_\ell \le K \le N_0$, $\ell = 0, 1, 2, \cdots$, we have that $N - N_0 \le M$ and it follows that

\[ \mathbb{E}[M]\mu - \mathbb{E}[M]\,\mathbb{E}[(X-\mu)^+] \le \mathbb{E}[S_M] \le \mathbb{E}[M]\mu + \mathbb{E}[M]\,\mathbb{E}[(X-\mu)^-]. \]

Hence,

\[ \alpha = \mu - \mathbb{E}[(X-\mu)^+] \le \frac{\mathbb{E}[S_M]}{\mathbb{E}[M]} \le \mu + \mathbb{E}[(X-\mu)^-] = \beta. \]

Finally, the conclusion of the theorem follows from the above inequality and Lemma 14.
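L Proof of Theorem 14


Define

\[ \mathcal{N}(\epsilon) = \inf\{n \in \mathbb{N} : n \ge 1 + \epsilon + g(\overline{X}_n)\}, \]

where $\epsilon > 0$. Define $K = 1$, $N_0 = 1$ and $N_\ell = \ell + 1$ for $\ell \in \mathbb{N}$. Then, $N_{\ell+1} - N_\ell \le K \le N_0$ for $\ell = 0, 1, 2, \cdots$. Since $g$ is non-negative, it must be true that $N_0 < 1 + \epsilon + g(\overline{X}_{N_0})$ is a sure event. Therefore, we can apply Theorem 13 to the stopping time $\mathcal{N}(\epsilon)$ to conclude that

\[ \mathbb{E}[\mathcal{N}(\epsilon)] \le 1 + \max_{\theta \in \mathscr{D}}\,[1 + \epsilon + g(\theta)] \le 2 + \epsilon + \max_{\theta \in \mathscr{D}} g(\theta). \]

Since $N \le \mathcal{N}(\epsilon)$, we have

\[ \mathbb{E}[N] \le \mathbb{E}[\mathcal{N}(\epsilon)] \le 2 + \epsilon + \max_{\theta \in \mathscr{D}} g(\theta). \]

Since the above inequalities hold for arbitrarily small $\epsilon > 0$, it must be true that $\mathbb{E}[N] \le 2 + \max_{\theta \in \mathscr{D}} g(\theta)$. This completes the proof of the theorem.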


M Proof of Theorem 15

As a consequence of the assumption that $\Pr\{a \le X \le b\} = 1$, where $a, b \in \mathbb{R}^d$, it must be true that each element of $\mathbb{E}[|X|^3]$ is finite. It follows from Theorem 13 that

\[ \mathbb{E}[N] \le K + \max_{\theta \in \mathscr{D}} g(\theta), \]

where $\mathscr{D} = \{\theta \in D \mid \alpha \le \theta \le \beta\}$ with $\alpha = \mu - \mathbb{E}[(X-\mu)^+]$ and $\beta = \mu + \mathbb{E}[(X-\mu)^-]$. By (60) and (61) of Lemma 11 we have

\[ \alpha = \mu - \mathbb{E}[(X-\mu)^+] \ge \mu - \nu \]

and

\[ \beta = \mu + \mathbb{E}[(X-\mu)^-] \le \mu + \nu, \]

respectively. This completes the proof of the theorem.


N Proof of Theorem 16

To prove Theorem 16 we need some preliminary results.

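Lemma 15 Let $\rho, \varrho \in \mathbb{R}$. Let $\theta$ and $\vartheta$ be column vectors such that $\theta^\top, \vartheta^\top \in \mathbb{R}^d$. Define

\[ \mathscr{Q} = \{q \in \mathbb{R}^d : 0_d \le q \le 1_d\} \]

and $\mathscr{V} = \{q \in \mathscr{Q} : \text{each element of } q \text{ assumes value 0 or 1}\}$. Assume that $\rho + q\theta > 0$ for all $q \in \mathscr{V}$. Then,

\[ \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{Q}\right\} = \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{V}\right\}. \]

Proof. We shall prove the result of the lemma by mathematical induction on the dimension $d$. For $d = 1$, we have that $\mathscr{Q}$ is the interval $[0, 1]$ and that $\mathscr{V} = \{0, 1\}$. In this case, $\theta$, $\vartheta$ and $q$ are scalars. If $\theta = 0$, then

\[ \frac{\varrho + q\vartheta}{\rho + q\theta} = \frac{\varrho + q\vartheta}{\rho}, \]

which is a linear function of $q \in [0, 1]$. If $\theta \neq 0$, then

\[ \frac{\varrho + q\vartheta}{\rho + q\theta} = \frac{\varrho - \rho\frac{\vartheta}{\theta} + \frac{\vartheta}{\theta}(\rho + q\theta)}{\rho + q\theta} = \frac{\varrho - \rho\frac{\vartheta}{\theta}}{\rho + q\theta} + \frac{\vartheta}{\theta}, \]

which is a monotone function of $q \in [0, 1]$. So, $\frac{\varrho + q\vartheta}{\rho + q\theta}$ is a monotone function of $q \in [0, 1]$, regardless of the value of $\theta$. As a consequence of such monotonicity, we have

\[ \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in [0, 1]\right\} = \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \{0, 1\}\right\}. \]

This proves that the result of the lemma holds when the dimension $d$ is equal to 1. Now we assume that the result of the lemma holds when the dimension $d$ is equal to $k \ge 1$. Based on this induction hypothesis, we need to show that the result of the lemma also holds when the dimension $d$ is equal to $k + 1$. This amounts to proving

\[ \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{Q}_{k+1}\right\} = \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{V}_{k+1}\right\}. \]

Here, $\theta$ and $\vartheta$ are column vectors of size $(k + 1) \times 1$,

\[ \mathscr{Q}_k = \{q \in \mathbb{R}^k : 0_k \le q \le 1_k\} \]

and $\mathscr{V}_k = \{q \in \mathscr{Q}_k : \text{each element of } q \text{ assumes value 0 or 1}\}$ for $k \in \mathbb{N}$. Note that

\[ \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{Q}_{k+1}\right\} = \max\left\{\frac{\varrho + q'\vartheta' + q''\vartheta''}{\rho + q'\theta' + q''\theta''} : q'' \in \mathscr{Q}_k,\ q' \in [0, 1]\right\}. \tag{71} \]

Here, $q'$ is the first element of $q$,
$q''$ is the vector obtained by eliminating the first element of $q$,
$\theta'$ is the first element of $\theta$,
$\theta''$ is the column vector obtained by eliminating the first element of $\theta$,
$\vartheta'$ is the first element of $\vartheta$,
$\vartheta''$ is the column vector obtained by eliminating the first element of $\vartheta$.

Note that

\[ \max\left\{\frac{\varrho + q'\vartheta' + q''\vartheta''}{\rho + q'\theta' + q''\theta''} : q'' \in \mathscr{Q}_k,\ q' \in [0, 1]\right\} = \max_{q'' \in \mathscr{Q}_k} \max\left\{\frac{\varrho + q'\vartheta' + q''\vartheta''}{\rho + q'\theta' + q''\theta''} : q' \in [0, 1]\right\}. \tag{72} \]

For fixed $q'' \in \mathscr{Q}_k$, using previous arguments, it can be shown that $\frac{\varrho + q'\vartheta' + q''\vartheta''}{\rho + q'\theta' + q''\theta''}$ is a monotone function of $q' \in [0, 1]$. Hence,

\[ \max\left\{\frac{\varrho + q'\vartheta' + q''\vartheta''}{\rho + q'\theta' + q''\theta''} : q' \in [0, 1]\right\} = \max\left\{\frac{\varrho + q''\vartheta''}{\rho + q''\theta''},\ \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''}\right\}. \tag{73} \]

It follows from (71), (72) and (73) that

\[
\begin{aligned}
\max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{Q}_{k+1}\right\}
&= \max_{q'' \in \mathscr{Q}_k} \max\left\{\frac{\varrho + q''\vartheta''}{\rho + q''\theta''},\ \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''}\right\} \\
&= \max\left\{\max_{q'' \in \mathscr{Q}_k} \frac{\varrho + q''\vartheta''}{\rho + q''\theta''},\ \max_{q'' \in \mathscr{Q}_k} \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''}\right\}.
\end{aligned}
\]

By the induction hypothesis, we have

\[ \max_{q'' \in \mathscr{Q}_k} \frac{\varrho + q''\vartheta''}{\rho + q''\theta''} = \max_{q'' \in \mathscr{V}_k} \frac{\varrho + q''\vartheta''}{\rho + q''\theta''}, \qquad \max_{q'' \in \mathscr{Q}_k} \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''} = \max_{q'' \in \mathscr{V}_k} \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''}. \]

Therefore,

\[
\begin{aligned}
\max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{Q}_{k+1}\right\}
&= \max\left\{\max_{q'' \in \mathscr{V}_k} \frac{\varrho + q''\vartheta''}{\rho + q''\theta''},\ \max_{q'' \in \mathscr{V}_k} \frac{\varrho + \vartheta' + q''\vartheta''}{\rho + \theta' + q''\theta''}\right\} \\
&= \max\left\{\frac{\varrho + q\vartheta}{\rho + q\theta} : q \in \mathscr{V}_{k+1}\right\}.
\end{aligned}
\]

This completes the induction step. The proof of the lemma is thus completed.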

□ 
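Lemma 15 says that a linear-fractional function with positive denominator attains its maximum over the unit hypercube at a vertex. The following is a minimal numerical sketch of that statement, assuming hypothetical coefficients in dimension $d = 3$; the function names `ratio`, `vertex_max` and `sampled_max` are illustrative and not part of the paper. Random sampling of the hypercube is only a sanity check, not a proof.

```python
from itertools import product
import random

def ratio(q, num0, num, den0, den):
    """Evaluate (num0 + q.num) / (den0 + q.den) at a point q of the hypercube."""
    top = num0 + sum(qi * ni for qi, ni in zip(q, num))
    bottom = den0 + sum(qi * di for qi, di in zip(q, den))
    return top / bottom

def vertex_max(num0, num, den0, den):
    """Maximum of the ratio over the 2^d vertices (all coordinates 0 or 1)."""
    d = len(num)
    return max(ratio(v, num0, num, den0, den) for v in product((0, 1), repeat=d))

def sampled_max(num0, num, den0, den, trials=5000, seed=0):
    """Largest ratio observed at randomly sampled points of the hypercube."""
    rng = random.Random(seed)
    d = len(num)
    return max(
        ratio([rng.random() for _ in range(d)], num0, num, den0, den)
        for _ in range(trials)
    )

# Hypothetical coefficients; the denominator is positive at every vertex.
num0, num = 1.0, [2.0, -1.0, 0.5]
den0, den = 3.0, [1.0, -2.0, 0.5]
best_vertex = vertex_max(num0, num, den0, den)
best_sample = sampled_max(num0, num, den0, den)
print(best_vertex, best_sample)
```

In the proof of Theorem 16 this reduction replaces a maximization over a continuum of $q$ by a finite search over $2^d$ vertices.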


Lemma 16 If $\mathbb{E}[M]\alpha + \zeta \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta + \eta$, then there exists $q$ such that $0_d \le q \le 1_d$ and that $\mathbb{E}[S_M] = \mathbb{E}[M]\theta + \phi$, where $\theta = q(\alpha - \beta) + \beta$ and $\phi = q(\zeta - \eta) + \eta$.


Proof. We consider the scalar case. The argument can be readily generalized to the vector case. If $\mathbb{E}[S_M] = \mathbb{E}[M]\beta + \eta$, then the lemma holds with $\theta = \beta$ and $\phi = \eta$. If $\mathbb{E}[S_M] = \mathbb{E}[M]\alpha + \zeta$, then the lemma holds with $\theta = \alpha$ and $\phi = \zeta$. Hence, it remains to prove this lemma under the assumption that

\[ \mathbb{E}[M]\alpha + \zeta < \mathbb{E}[S_M] < \mathbb{E}[M]\beta + \eta. \tag{74} \]

For this purpose, define

\[ \theta_q = q(\alpha - \beta) + \beta, \qquad \phi_q = q(\zeta - \eta) + \eta \]

and

\[ w(q) = \mathbb{E}[S_M] - \mathbb{E}[M]\theta_q - \phi_q \]

for $q \in [0, 1]$. Note that $w(q) = \mathbb{E}[S_M] - \mathbb{E}[M][q(\alpha - \beta) + \beta] - [q(\zeta - \eta) + \eta]$ is a continuous function of $q \in [0, 1]$. Clearly, as a consequence of (74),

\[ w(0) = \mathbb{E}[S_M] - \mathbb{E}[M]\beta - \eta < 0, \qquad w(1) = \mathbb{E}[S_M] - \mathbb{E}[M]\alpha - \zeta > 0. \]

By virtue of the intermediate value theorem, there exists a number $q^* \in (0, 1)$ such that $w(q^*) = 0$. This implies that

\[ \mathbb{E}[S_M] = \mathbb{E}[M]\theta_{q^*} + \phi_{q^*}, \]

where

\[ \theta_{q^*} = q^*(\alpha - \beta) + \beta, \qquad \phi_{q^*} = q^*(\zeta - \eta) + \eta \]

with $q^* \in (0, 1)$. This completes the proof of the lemma.


□ 
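In the scalar case the function $w(q)$ in the proof of Lemma 16 is affine in $q$, so the root $q^*$ guaranteed by the intermediate value theorem can be written down explicitly as $q^* = -w(0)/(w(1) - w(0))$. The sketch below computes it for hypothetical numerical values; the function name `solve_q` and all inputs are assumptions for illustration.

```python
from fractions import Fraction

def solve_q(e_s, e_m, alpha, beta, zeta, eta):
    """Return (q, theta_q, phi_q) with e_s == e_m * theta_q + phi_q and 0 <= q <= 1.

    Requires e_m * alpha + zeta <= e_s <= e_m * beta + eta (scalar case).
    """
    w0 = e_s - e_m * beta - eta      # w(0) <= 0
    w1 = e_s - e_m * alpha - zeta    # w(1) >= 0
    if not (w0 <= 0 <= w1):
        raise ValueError("E[S_M] is outside the assumed bounds")
    # w is affine in q; if w0 == w1 both bounds coincide and any q works.
    q = Fraction(0) if w1 == w0 else -w0 / (w1 - w0)
    theta = q * (alpha - beta) + beta
    phi = q * (zeta - eta) + eta
    return q, theta, phi

# Hypothetical values: E[M] = 5, alpha = 1, beta = 3, zeta = -2, eta = 2, E[S_M] = 10.
q, theta, phi = solve_q(Fraction(10), Fraction(5), Fraction(1), Fraction(3),
                        Fraction(-2), Fraction(2))
print(q, theta, phi)
```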


We are now in a position to prove the theorem. Since the assumptions (I) - (VI) are fulfilled, it follows from Theorem 10 that $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$. Hence, $\mathbb{E}[S_M]$ exists. Since $C > 0$ and $\mathscr{B}$ contains $(0, 0_d)$, it must be true that

\[ As + Bt \le C \]

for any $(t, s) \in \mathscr{B}$. Hence,

\[ \Pr\{AS_M + BM \le C\} = 1. \]

By Theorem 1 we have

\[ A\mathbb{E}[S_M] + B\mathbb{E}[M] \le C. \]

By Lemma 10 we have $\mathbb{E}[M]\alpha + \zeta \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta + \eta$. According to Lemma 16, there exist $\theta^* = q^*(\alpha - \beta) + \beta$ and $\phi^* = q^*(\zeta - \eta) + \eta$ such that $\mathbb{E}[S_M] = \theta^*\mathbb{E}[M] + \phi^*$. Hence,

\[ \mathbb{E}[M](A\theta^* + B) + A\phi^* \le C. \]

As a consequence of the assumption that the minimum of $\{B + A[\beta + q(\alpha - \beta)] : q \in \mathscr{V}\}$ is positive, we have

\[ \min\{A\theta_q + B : 0_d \le q \le 1_d\} > 0 \]

and thus $A\theta^* + B > 0$. It follows that

\[ \mathbb{E}[M] \le \frac{C - A\phi^*}{B + A\theta^*} \le \max\left\{\frac{C - A[\eta + q(\zeta - \eta)]}{B + A[\beta + q(\alpha - \beta)]} : 0_d \le q \le 1_d\right\}. \]

Invoking Lemma 15, we have

\[ \max\left\{\frac{C - A[\eta + q(\zeta - \eta)]}{B + A[\beta + q(\alpha - \beta)]} : 0_d \le q \le 1_d\right\} = \max\left\{\frac{C - A[\eta + q(\zeta - \eta)]}{B + A[\beta + q(\alpha - \beta)]} : q \in \mathscr{V}\right\}. \]

This completes the proof of the theorem.
O Proof of Theorem 17


Since $\Pr\{a \le X \le b\} = 1$ and the assumptions (I) - (V) are fulfilled, it follows from arguments similar to those of Theorem 10 that $\mathbb{E}[M] \le \mathbb{E}[N] < \infty$. Hence, $\mathbb{E}[S_M]$ exists. Since $C > 0$ and $\mathscr{B}$ contains $(0, 0_d)$, it must be true that $As + Bt \le C$ for any $(t, s) \in \mathscr{B}$. Hence, $\Pr\{AS_M + BM \le C\} = 1$. By Theorem 1 we have

\[ A\mathbb{E}[S_M] + B\mathbb{E}[M] \le C. \]

According to Lemma 12 we have

\[ \mathbb{E}[M]\alpha + \zeta \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta + \eta, \]

\[ \mathbb{E}[M]\alpha' + \zeta' \le \mathbb{E}[S_M] \le \mathbb{E}[M]\beta' + \eta'. \]

The proof of the theorem can be completed by using arguments similar to those of Theorem 16.


P Proof of Theorem 18

We need some preliminary results. 

Lemma 17 Assume that each element of $\mathbb{E}[|X|]$ is finite. Then, $\mathbb{E}[N] < \infty$.


Proof. Since $As + Bt = C$, where $C > 0$, is the supporting hyperplane of the continuity region passing through the boundary point $(m, m\mu)$, it must be true that $C = m(A\mu + B) > 0$. By the definition that $Z = BK + A\sum_{i=1}^{K} X_i$, we have $\mathbb{E}[Z] = (A\mu + B)K > 0$. Note that

\[ \mathbb{E}[|Z|] = \mathbb{E}\left[\left|BK + A\sum_{i=1}^{K} X_i\right|\right] \le |A| \sum_{i=1}^{K} \mathbb{E}[|X_i|] + |BK| = K\left(|A|\,\mathbb{E}[|X|] + |B|\right), \]

where the upper bound is finite as a consequence of the assumption that each element of $\mathbb{E}[|X|]$ is finite. Let $Z_1, Z_2, \cdots$ be i.i.d. random variables having the same distribution as that of $Z$. Define

\[ \tau = \inf\left\{n \in \mathbb{N} : \sum_{i=1}^{n} Z_i > C\right\}. \tag{75} \]

Making use of the fact that $\mathbb{E}[Z] > 0$, $\mathbb{E}[|Z|] < \infty$ and assertion (i) of Theorem 3.1 on page 83 of Gut's book [12], we have that

\[ \mathbb{E}[\tau] < \infty. \]

Now define

\[ \mathcal{N} = \inf\{t \in \mathscr{T} : AS_t + Bt > C\}, \tag{76} \]

where $\mathscr{T} = \{n \in \mathbb{N} : \frac{n}{K} \in \mathbb{N}\}$. Then,

\[ \mathbb{E}[\mathcal{N}] = K\,\mathbb{E}[\tau] < \infty. \]

Since $C > 0$ and the convex set $\mathscr{B}$ contains $(0, 0_d)$, it must be true that $As + Bt \le C$ for all $(t, s) \in \mathscr{B}$. Hence,

\[ \{(n, S_n) \in \mathscr{B}\} \subseteq \{AS_n + Bn \le C\} \]

for $n \in \mathbb{N}$. This implies that $N \le \mathcal{N}$. Hence, $\mathbb{E}[N] \le \mathbb{E}[\mathcal{N}] < \infty$. This completes the proof of the lemma.

□ 


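We are now in a position to prove the theorem. By the assumption that each element of $\mathbb{E}[|X|^2]$ is finite, it must be true that each element of $\mathbb{E}[|X|]$ is finite. Hence, it follows from Lemma 17 that $\mathbb{E}[N] \le \mathbb{E}[\mathcal{N}] < \infty$, where $\mathcal{N}$ is defined by (76). Note that

\[
\begin{aligned}
\mathbb{E}[Z^2] &= \mathbb{E}\left[\left(BK + A\sum_{i=1}^{K} X_i\right)^2\right] \\
&= \mathbb{E}\left[\left(A\sum_{i=1}^{K} X_i\right)^2\right] + 2BK\,\mathbb{E}\left[A\sum_{i=1}^{K} X_i\right] + (BK)^2 \\
&= \mathbb{E}\left[\left(A\sum_{i=1}^{K} X_i\right)^2\right] + 2BK^2A\mu + (BK)^2 \\
&\le K\|A\|_2^2 \sum_{i=1}^{K} \mathbb{E}[\|X_i\|_2^2] + 2BK^2A\mu + (BK)^2 \\
&= K^2\|A\|_2^2\,\mathbb{E}[\|X\|_2^2] + 2BK^2A\mu + (BK)^2.
\end{aligned}
\]

Using the above bound for $\mathbb{E}[Z^2]$ and the assumption that $\mathbb{E}[\|X\|_2^2]$ is finite, we have that $\mathbb{E}[Z^2] < \infty$. Let $Z_1, Z_2, \cdots$ be i.i.d. samples of $Z$ and define $\tau$ as in (75). In the proof of Lemma 17 we have established that $\mathbb{E}[\tau] < \infty$. By Lorden's inequality,

\[ \mathbb{E}\left[\sum_{i=1}^{\tau} Z_i - C\right] \le \frac{\mathbb{E}[(Z^+)^2]}{\mathbb{E}[Z]}. \]

Using Wald's equation, we have

\[ \mathbb{E}\left[\sum_{i=1}^{\tau} Z_i\right] = \mathbb{E}[\tau]\,\mathbb{E}[Z] \]

and thus

\[ \mathbb{E}[\tau]\,\mathbb{E}[Z] - C \le \frac{\mathbb{E}[(Z^+)^2]}{\mathbb{E}[Z]}, \]

from which we have

\[ \mathbb{E}[\tau] \le \frac{C}{\mathbb{E}[Z]} + \frac{\mathbb{E}[(Z^+)^2]}{\mathbb{E}^2[Z]}. \]

Hence,

\[
\begin{aligned}
\mathbb{E}[N] \le \mathbb{E}[\mathcal{N}] = K\,\mathbb{E}[\tau]
&\le \frac{CK}{\mathbb{E}[Z]} + \frac{K\,\mathbb{E}[(Z^+)^2]}{\mathbb{E}^2[Z]} \\
&= \frac{C}{A\mu + B} + \frac{\mathbb{E}[(Z^+)^2]}{K(A\mu + B)^2} \\
&= m + \frac{1}{K}\left(\frac{m}{C}\right)^2 \mathbb{E}[(Z^+)^2],
\end{aligned}
\tag{77}
\]

where we have used the fact that $\mathbb{E}[Z] = K(A\mu + B)$ and $C = m(A\mu + B)$. Note that

\[
\begin{aligned}
\mathbb{E}[(Z^+)^2] \le \mathbb{E}[Z^2]
&= \mathbb{E}\left[\left(BK + \sum_{i=1}^{K} AX_i\right)^2\right] \\
&= \mathbb{E}\left[\left(BK + KA\mu + \sum_{i=1}^{K} A(X_i - \mu)\right)^2\right] \\
&= K^2(A\mu + B)^2 + \mathbb{E}\left[\left(\sum_{i=1}^{K} A(X_i - \mu)\right)^2\right] \\
&= K^2(A\mu + B)^2 + K\,\mathbb{E}\left[(A(X - \mu))^2\right].
\end{aligned}
\tag{78}
\]

It follows from (77) and (78) that

\[ \mathbb{E}[N] \le K + \frac{C}{A\mu + B} + \frac{\mathbb{E}[(A(X - \mu))^2]}{(A\mu + B)^2} = m + K + \left(\frac{m}{C}\right)^2 \mathbb{E}\left[(A(X - \mu))^2\right]. \tag{79} \]

Note that

\[ A(X - \mu) = \langle A^\top, X - \mu\rangle. \]

Since the absolute value of the inner product of two vectors is no greater than the product of their Euclidean norms, we have

\[ \mathbb{E}\left[(A(X - \mu))^2\right] = \mathbb{E}\left[\langle A^\top, X - \mu\rangle^2\right] \le \mathbb{E}\left[\|A^\top\|_2^2\,\|X - \mu\|_2^2\right] = \|A^\top\|_2^2 \times \mathbb{E}[\|X - \mu\|_2^2]. \]

Therefore,

\[ \mathbb{E}[N] \le m + K + \left(\frac{m}{C}\right)^2 \mathbb{E}\left[(A(X - \mu))^2\right] \le m + K + \left(\frac{m}{C}\right)^2 \|A^\top\|_2^2 \times \mathbb{E}[\|X - \mu\|_2^2]. \]

This establishes assertion (I) of the theorem.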
If the elements of $X$ are mutually independent, then $\mathbb{E}[(A(X - \mu))^2] = A^2\,\mathbb{E}[(X - \mu)^2]$. It follows from this fact and (79) that

\[ \mathbb{E}[N] \le m + K + \left(\frac{m}{C}\right)^2 A^2\,\mathbb{E}[(X - \mu)^2]. \]

This establishes assertion (II) of the theorem.

It remains to show assertion (III). As a consequence of the definition of $Z$ and the assumption that $\Pr\{a \le X \le b\} = 1$, we have $Ku \le Z \le Kv$ almost surely. It follows that

\[ (Z^+)^2 \le Z^2 \le \frac{(Kv)^2 - (Ku)^2}{Kv - Ku}(Z - Ku) + (Ku)^2 = K(u + v)Z - K^2uv \]

almost surely. Hence,

\[ \mathbb{E}[(Z^+)^2] \le K(u + v)\,\mathbb{E}[Z] - K^2uv = K^2(u + v)(A\mu + B) - K^2uv. \tag{80} \]

Making use of (77) and (80), we have

\[
\begin{aligned}
\mathbb{E}[N] &\le \frac{C}{A\mu + B} + \frac{\mathbb{E}[(Z^+)^2]}{K(A\mu + B)^2} \\
&\le \frac{C}{A\mu + B} + \frac{K^2(u + v)(A\mu + B) - K^2uv}{K(A\mu + B)^2} \\
&= \frac{C}{A\mu + B} + \frac{K(u + v)(A\mu + B) - Kuv}{(A\mu + B)^2} \\
&= \frac{C}{A\mu + B} + \frac{K(u + v)}{A\mu + B} - \frac{Kuv}{(A\mu + B)^2} \\
&= m + \frac{mK(u + v)}{C} - \frac{m^2Kuv}{C^2},
\end{aligned}
\]

where we have used the assumption that $m(A\mu + B) = C > 0$. This establishes the first inequality of assertion (III).

To show the second inequality of assertion (III), note that

\[ (Z^+)^2 \le \frac{(Kv)^2}{Kv - Ku}(Z - Ku) = \frac{Kv^2}{v - u}(Z - Ku) \]

almost surely for $u < 0$. Hence,

\[ \mathbb{E}[(Z^+)^2] \le \frac{K^2v^2}{v - u}(A\mu + B - u), \qquad u < 0. \tag{81} \]

Making use of (77) and (81), we have

\[
\begin{aligned}
\mathbb{E}[N] &\le \frac{C}{A\mu + B} + \frac{\mathbb{E}[(Z^+)^2]}{K(A\mu + B)^2} \\
&\le \frac{C}{A\mu + B} + \frac{\frac{K^2v^2}{v - u}(A\mu + B - u)}{K(A\mu + B)^2} \\
&= \frac{C}{A\mu + B} + \frac{Kv^2}{v - u} \cdot \frac{A\mu + B - u}{(A\mu + B)^2} \\
&= m + \frac{Kv^2}{v - u}\left(\frac{m}{C}\right)^2\left(\frac{C}{m} - u\right)
\end{aligned}
\]

for $C > 0 > u$.

This completes the proof of the theorem. 
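The bound $\mathbb{E}[\tau] \le C/\mathbb{E}[Z] + \mathbb{E}[(Z^+)^2]/\mathbb{E}^2[Z]$, obtained above from Lorden's inequality and Wald's equation, can be illustrated by simulation. The following is a minimal Monte Carlo sketch, assuming a hypothetical increment distribution $Z \sim \mathrm{Uniform}(-0.5, 1.5)$ and threshold $C = 5$; these choices, the seed, and the function names are assumptions, not quantities from the paper. For this distribution $\mathbb{E}[Z] = 0.5$ and $\mathbb{E}[(Z^+)^2] = \int_0^{1.5} z^2/2\,dz = 0.5625$.

```python
import random

def first_passage_time(draw, threshold):
    """Smallest n such that Z_1 + ... + Z_n exceeds the threshold."""
    total, n = 0.0, 0
    while total <= threshold:
        total += draw()
        n += 1
    return n

def lorden_bound(threshold, mean_z, mean_zplus_sq):
    """C / E[Z] + E[(Z^+)^2] / E[Z]^2, the bound on E[tau] derived in the proof."""
    return threshold / mean_z + mean_zplus_sq / mean_z ** 2

rng = random.Random(2024)                       # fixed seed for reproducibility
draw = lambda: rng.uniform(-0.5, 1.5)           # hypothetical increment distribution
C = 5.0
trials = 20000
estimate = sum(first_passage_time(draw, C) for _ in range(trials)) / trials
bound = lorden_bound(C, 0.5, 0.5625)
print(estimate, bound)
```

Since the sum at time $\tau$ exceeds $C$, Wald's equation also gives the lower bound $\mathbb{E}[\tau] \ge C/\mathbb{E}[Z]$, so the simulated mean should fall between $C/\mathbb{E}[Z]$ and the Lorden-type bound, up to Monte Carlo error.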


Q Proof of Theorem 19

We need some preliminary results. 

Lemma 18 Let $(m, m\mu)$, where $m = g(\mu)$, be a boundary point of the continuity region $\mathscr{B}$. Define

\[ A = -[V(\mu)]^\top, \qquad B = 1 - A\mu, \qquad C = g(\mu). \]

Then, $As + Bt = C$ is the supporting hyperplane for $\mathscr{B}$ passing through the boundary point $(m, m\mu)$.


Proof. Define function

\[ f(t, s) = t - g\!\left(\frac{s}{t}\right) \]

for $(t, s) \in \mathscr{B}$ with $t > 0$. As a consequence of the definition of the function $g(\cdot)$, it must be true that $f(t, s) = 0$ holds for any boundary point $(t, s)$ of $\mathscr{B}$ with $t > 0$. In particular, $f(m, m\mu) = 0$. Since $g(v)$ is differentiable in a neighborhood of $v = \mu$, the function $f(t, s)$ is differentiable in a neighborhood of $(t, s) = (m, m\mu)$. Since $\mathscr{B}$ is convex, it must be true that the tangent plane to the surface $f(t, s) = 0$, passing through $(m, m\mu)$, coincides with the supporting hyperplane of $\mathscr{B}$. Therefore, to show that $As + Bt = C$ is the supporting hyperplane of $\mathscr{B}$ passing through the boundary point $(m, m\mu)$, it suffices to show that $C$ is equal to $m$, and that $A^\top$ and $B$ are, respectively, equal to the partial derivatives of $f(t, s)$ with respect to $s$ and $t$ when $s = m\mu$, $t = m$. In other words, it is sufficient to show that

\[ C = m, \qquad A^\top = \left.\frac{\partial f(t, s)}{\partial s}\right|_{t=m,\ s=m\mu}, \qquad B = \left.\frac{\partial f(t, s)}{\partial t}\right|_{t=m,\ s=m\mu}. \]

Define

\[ h(v) = \frac{\partial g(v)}{\partial v}. \]

Using the chain rule of differentiation, we have

\[ \frac{\partial f(t, s)}{\partial s} = -\frac{1}{t}\,h\!\left(\frac{s}{t}\right) \]

and

\[ \frac{\partial f(t, s)}{\partial t} = 1 + \frac{1}{t^2}\left\langle h\!\left(\frac{s}{t}\right), s\right\rangle. \]

Evaluating such derivatives with $t = m$, $s = m\mu$ yields

\[ \left.\frac{\partial f(t, s)}{\partial s}\right|_{t=m,\ s=m\mu} = -\frac{h(\mu)}{g(\mu)} = -V(\mu) = A^\top \]

and

\[ \left.\frac{\partial f(t, s)}{\partial t}\right|_{t=m,\ s=m\mu} = 1 + \frac{\langle h(\mu), \mu\rangle}{g(\mu)} = 1 - A\mu = B. \]

Since the boundary point $(m, m\mu)$ is in the supporting hyperplane, it must be true that

\[ C = A(m\mu) + Bm = m(A\mu + B). \]

Observing that $A\mu + B = 1$, we have $C = m$. This completes the proof of the lemma.


□ 


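We are now in a position to prove the theorem. Making use of Lemma 18 and assertion (I) of Theorem 18, we have

\[ \mathbb{E}[N] \le m + 1 + \left(\frac{m}{C}\right)^2 \mathbb{E}\left[\langle A^\top, X - \mu\rangle^2\right], \]

where $A = -[V(\mu)]^\top$ and $C = m = g(\mu)$. Hence,

\[ \mathbb{E}[N] \le g(\mu) + 1 + \mathbb{E}\left[\langle V(\mu), X - \mu\rangle^2\right]. \]

Using the fact that the absolute value of the inner product of two vectors is no greater than the product of their norms, we have

\[ \langle V(\mu), X - \mu\rangle^2 \le \|V(\mu)\|_2^2 \times \|X - \mu\|_2^2. \]

It follows that

\[ \mathbb{E}[N] \le g(\mu) + 1 + \mathbb{E}\left[\langle V(\mu), X - \mu\rangle^2\right] \le g(\mu) + 1 + \|V(\mu)\|_2^2 \times \mathbb{E}[\|X - \mu\|_2^2]. \]

This completes the proof of the theorem.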
R Proof of Theorem 20


Since the assumptions (I) - (VI) are fulfilled, it follows from Theorem 10 that $\mathbb{E}[N] < \infty$. Note that

E[IV] =n t + (^+1 - N i) Pr i N > N e} ■ (82) 


£>t 

i+ie££ 


By the definitions of N,& n , p(n) and the convexity of we have 

{N>N t } c {{N e ,S Ne )e^} 
c {X N( G ( ^N f } 
c {\\X Ne -p\\ 2 >p(N,)} 

for £ > t. Note that 


Pr {N > Ni} < Pr{\\X N ,-fi\\ 2 >p(N £ )} 

= Pr{\\X N( -p,\\ 2 2 >[p(Ni)} 2 } 

k 

.fc=1 

< Pr^ \X k N ,~»k\ Z > 

d 




-fc |2 ^ 

d 


for some k among 1, • • • , d 


i=k 


< E Pr i^-^i 2 > 


\pmv 

d 


E[IV] <N r + E (N e+1 - N e ) E Pr 


£>t 

i+i^S£ 


Vd J 


(83) 

(84) 


for £ > t, where we have used the Pigeon-Hole principle in (1841) . Making use of (1821) and (1841) . we 
have 

7 k p(N e ) \ 

I '-Ne ~ dk E - 

k =1 ^ 


If V is a scalar random variable and p is less than the inhmum of then it follows from 
the definitions of N,£S n , p(n) and the convexity of £% that 

{N > Ni} C {(N e , Sn ( ) G £%} C {Xjv £ € Sfjv £ } E {Vw £ > p + p(Ni)} 

for £> t. Hence, 


Pr {AT > Ni} < Pr [X Nl > p + p(N e )} 
for £ > r. Making use of (1821) and ([85]) . we have 

E[IV] < Nr + ]T (N e+1 - Ni) Pr [X Nl >p + p(N e )} . 


( 85 ) 


£>t 

£+ie& 
If X is a scalar random variable and p is greater than the supremum of c 4n t , then it follows 
from the definitions of N,& n , p(n) and the convexity of £% that 


{AT > Ng} C {(Ng, S N( ) G S%} C {X Nf G%}C < p - p(IV<)} 


for £ > t. Hence, 

Pr {AT > Ng} < Pr {X N( < p - p(Ng )} (86) 

for l > t. Making use of (l82l) and d86l) . we have 

E[N] < N T + J2 (N t+1 - Ng) Pr [X Nl <p- p{Ng)} . 

e>r 

t+ie& 

This completes the proof of the theorem. 


S Proof of Theorem 21


We need some preliminary results. 


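Lemma 19 For $n \in \mathbb{N}$, $\mu \in \mathbb{R}^d$, the minimum of $\|s - n\mu\|_2^2$ with respect to $s \in \mathbb{R}^d$ subject to $As + Bn = C$ is equal to $\frac{[n(A\mu + B) - C]^2}{AA^\top}$.


Proof. For $k = 1, \cdots, d$, let $s_k$, $\mu_k$, $a_k$ denote the $k$-th element of $s$, $\mu$ and $A$, respectively. In other words,

\[ s = [s_1, \cdots, s_d], \qquad \mu = [\mu_1, \cdots, \mu_d], \qquad A = [a_1, \cdots, a_d]^\top. \]

For simplicity of notation, define $D = C - Bn$. Then, the problem of minimizing $\|s - n\mu\|_2^2$ with respect to $s \in \mathbb{R}^d$ subject to $As + Bn = C$ can be written as the problem of minimizing $\sum_{k=1}^{d} (s_k - n\mu_k)^2$ with respect to $s_1, \cdots, s_d \in \mathbb{R}$ subject to

\[ \sum_{k=1}^{d} a_k s_k = D. \]

We shall solve this problem by the Lagrange-multiplier method. Define

\[ f(\zeta, s_1, \cdots, s_d) = \sum_{k=1}^{d} (s_k - n\mu_k)^2 + \zeta\left(\sum_{k=1}^{d} a_k s_k - D\right). \]

Note that the partial derivatives are

\[ \frac{\partial f}{\partial s_k} = 2(s_k - n\mu_k) + \zeta a_k, \qquad k = 1, \cdots, d, \]

\[ \frac{\partial f}{\partial \zeta} = \sum_{k=1}^{d} a_k s_k - D. \tag{87} \]

Setting $\frac{\partial f}{\partial s_k} = 0$ yields

\[ s_k = n\mu_k - \frac{\zeta a_k}{2}, \qquad k = 1, \cdots, d. \]

Substituting the above expression of $s_k$ into (87) and setting $\frac{\partial f}{\partial \zeta} = 0$ yields

\[ \sum_{k=1}^{d} a_k\left(n\mu_k - \frac{\zeta a_k}{2}\right) = D, \]

i.e.,

\[ n\sum_{k=1}^{d} a_k\mu_k - \frac{\zeta}{2}\sum_{k=1}^{d} a_k^2 = D, \]

from which we have

\[ \zeta = 2\,\frac{n\sum_{k=1}^{d} a_k\mu_k - D}{\sum_{k=1}^{d} a_k^2}. \]

Hence,

\[ (s_k - n\mu_k)^2 = \left(\frac{n\sum_{j=1}^{d} a_j\mu_j - D}{\sum_{j=1}^{d} a_j^2}\right)^2 a_k^2, \qquad k = 1, \cdots, d. \]

It follows that

\[ \sum_{k=1}^{d} (s_k - n\mu_k)^2 = \frac{\left(n\sum_{k=1}^{d} a_k\mu_k - D\right)^2}{\sum_{k=1}^{d} a_k^2} = \frac{[n(A\mu + B) - C]^2}{AA^\top}. \]

This completes the proof of the lemma.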


□ 
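The closed form of Lemma 19 is the squared Euclidean distance from the point $n\mu$ to a hyperplane, and the stationary point found in the proof can be verified directly. The sketch below does this in exact rational arithmetic for hypothetical values of $A$, $B$, $C$, $\mu$ and $n$; the function names are illustrative, and vectors are represented as plain Python lists.

```python
from fractions import Fraction

def min_sq_distance(A, B, C, mu, n):
    """Closed form of Lemma 19: [n(A mu + B) - C]^2 / sum_k a_k^2."""
    a_mu = sum(a * m for a, m in zip(A, mu))
    return (n * (a_mu + B) - C) ** 2 / sum(a * a for a in A)

def lagrange_minimizer(A, B, C, mu, n):
    """Stationary point s_k = n mu_k - zeta a_k / 2 from the proof."""
    D = C - B * n
    a_mu = sum(a * m for a, m in zip(A, mu))
    zeta = 2 * (n * a_mu - D) / sum(a * a for a in A)
    return [n * m - zeta * a / 2 for a, m in zip(A, mu)]

# Hypothetical inputs (assumptions for illustration only).
A = [Fraction(1), Fraction(2)]
B, C = Fraction(1), Fraction(4)
mu = [Fraction(1), Fraction(1)]
n = 3

s_star = lagrange_minimizer(A, B, C, mu, n)
constraint = sum(a * s for a, s in zip(A, s_star)) + B * n   # should equal C
sq_dist = sum((s - n * m) ** 2 for s, m in zip(s_star, mu))   # attained objective
print(s_star, constraint, sq_dist, min_sq_distance(A, B, C, mu, n))
```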


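We are now in a position to prove the theorem. According to Lemma 19, we have

\[
\begin{aligned}
\rho(n) &= \frac{1}{n}\sqrt{\min\{\|s - n\mu\|_2^2 : As + Bn = C,\ s \in \mathbb{R}^d\}} \\
&= \frac{1}{n}\sqrt{\frac{[n(A\mu + B) - C]^2}{AA^\top}} \\
&= \frac{1}{n}\sqrt{\frac{\left[\frac{n}{m}C - C\right]^2}{AA^\top}} \\
&= \frac{1}{n}\sqrt{\frac{\left[\left(\frac{n}{m} - 1\right)C\right]^2}{AA^\top}} \\
&= \frac{1}{n}\sqrt{\frac{\left[\left(\frac{n}{m} - 1\right)m(A\mu + B)\right]^2}{AA^\top}} \\
&= \frac{1}{n}\sqrt{\frac{[(n - m)(A\mu + B)]^2}{AA^\top}} \\
&= \frac{\left(1 - \frac{m}{n}\right)|A\mu + B|}{\sqrt{AA^\top}}
\end{aligned}
\]

for $n \ge m$. Hence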


pm = 


(1-$)\Avl + B\ 


= Tl 


\/Z4 t 

for t > r. Making use of (fKHl) and (fI7|) of Theorem [20l we have 
In the case that X is a scalar random variable, we have that 


m 


pm = ii=[ i-^) 


B 

^ + A 


( 88 ) 


(89) 


for £ > t. As a consequence of (|HU1) and (fT%D . (HTTP of Theorem 12U1 we have (T2TD and 021) . This 
completes the proof of the theorem. 


T Proof of Theorem 24

We need some preliminary results. By a similar argument as that for proving Lemma 0 we can 
show the following result. 

Lemma 20 There exist $A$, $B$ and a positive number $C$ such that $As + Bt = C$ is a supporting hyperplane of $\mathscr{B}$, which passes through $(\tau, \tau\mu)$.

Define

\[ \mathcal{T} = \inf\{t > 0 : AW_t + Bt \ge C\}. \]

By a similar argument as that for proving Lemma [TJ we can show the following result. 

Lemma 21 There exist $\delta > 0$ and $T > 0$ such that $\{\mathcal{T} > t\} \subseteq \{\|\overline{W}_t - \mu\|_2 > \delta\}$ for $t$ greater than $T$.


The following result is well known (see, e.g., [25, page 55]).


Lemma 22 Let $B_t$ be a scalar Brownian motion with zero drift and unity diffusion. Then,

\[ \Pr\left\{\sup_{0 \le s \le t} |B_s| > \lambda\right\} \le 2\exp\left(-\frac{\lambda^2}{2t}\right), \qquad \lambda > 0,\ t > 0. \]

With the above two lemmas, we can show the boundedness of the average stopping time as stated as follows.


Lemma 23 $\mathbb{E}[\mathcal{T}] < \infty$.
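Proof. Let $\Sigma$ denote the diffusion matrix of $W_t$, i.e., $\mathbb{E}[(W_t - t\mu)^\top (W_t - t\mu)] = \Sigma$. Then, we can write

\[ W_t = t\mu + \boldsymbol{B}_t\Sigma^\top, \]

where $\boldsymbol{B}_t$ is a standard Brownian motion with drift vector $0_d$ and identity diffusion matrix. Note that

\[
\begin{aligned}
\Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} &= \Pr\{\|W_t - t\mu\|_2 > t\delta\} \\
&= \Pr\{\|\boldsymbol{B}_t\Sigma^\top\|_2 > t\delta\} \\
&= \Pr\{\|\boldsymbol{B}_t\Sigma^\top\|_2^2 > (t\delta)^2\} \\
&= \Pr\{\boldsymbol{B}_t\Sigma^\top\Sigma\boldsymbol{B}_t^\top > (t\delta)^2\}.
\end{aligned}
\]

Since $\Sigma^\top\Sigma$ is a positive semidefinite matrix, there exist an orthogonal matrix $U$ and a diagonal matrix $\Lambda$ with diagonal elements $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_d \ge 0$ such that

\[ \Sigma^\top\Sigma = U\Lambda U^\top. \]

Hence,

\[ \boldsymbol{B}_t\Sigma^\top\Sigma\boldsymbol{B}_t^\top \le \lambda_1\boldsymbol{B}_tUU^\top\boldsymbol{B}_t^\top = \lambda_1\boldsymbol{B}_t\boldsymbol{B}_t^\top \]

and it follows that

\[ \Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} \le \Pr\{\lambda_1\boldsymbol{B}_t\boldsymbol{B}_t^\top > (t\delta)^2\}. \]

If $\lambda_1 = 0$, then $\Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} = 0$ for all $t > 0$. It follows from Lemma 21 that $\Pr\{\mathcal{T} > t\} = 0$ for $t$ greater than $T$. Consequently, $\mathbb{E}[\mathcal{T}] \le T < \infty$. Hence, it remains to show $\mathbb{E}[\mathcal{T}] < \infty$ with $\lambda_1 > 0$. In this case, we have

\[ \Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} \le \Pr\left\{\boldsymbol{B}_t\boldsymbol{B}_t^\top > \frac{(t\delta)^2}{\lambda_1}\right\}. \]

Let $B_t^k$ denote the $k$-th component of $\boldsymbol{B}_t$, i.e., $\boldsymbol{B}_t = [B_t^1, \cdots, B_t^d]$. Then, $\boldsymbol{B}_t\boldsymbol{B}_t^\top = \sum_{k=1}^{d} |B_t^k|^2$ and

\[
\begin{aligned}
\Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} &\le \Pr\left\{\sum_{k=1}^{d} |B_t^k|^2 > \frac{(t\delta)^2}{\lambda_1}\right\} \\
&\le \Pr\left\{|B_t^k|^2 > \frac{(t\delta)^2}{d\lambda_1} \text{ for some } k \text{ among } 1, \cdots, d\right\} \\
&\le \sum_{k=1}^{d} \Pr\left\{|B_t^k|^2 > \frac{(t\delta)^2}{d\lambda_1}\right\} \\
&= \sum_{k=1}^{d} \Pr\left\{|B_t^k| > \frac{t\delta}{\sqrt{d\lambda_1}}\right\} \\
&= d\,\Pr\left\{|B_t| > \frac{t\delta}{\sqrt{d\lambda_1}}\right\},
\end{aligned}
\]

where $B_t$ is a scalar Brownian motion with zero drift and unity diffusion. Making use of Lemma 22, we have

\[ \Pr\left\{|B_t| > \frac{t\delta}{\sqrt{d\lambda_1}}\right\} \le 2\exp\left(-\frac{\left(\frac{t\delta}{\sqrt{d\lambda_1}}\right)^2}{2t}\right) = 2\exp\left(-\frac{t\delta^2}{2d\lambda_1}\right). \]

Hence,

\[ \Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} \le 2d\exp\left(-\frac{t\delta^2}{2d\lambda_1}\right) \]

for $t > 0$. Invoking Lemma 21, we have that

\[ \Pr\{\mathcal{T} > t\} \le \Pr\{\|\overline{W}_t - \mu\|_2 > \delta\} \le 2d\exp\left(-\frac{t\delta^2}{2d\lambda_1}\right) \]

for $t$ greater than $T$. It follows that

\[
\begin{aligned}
\mathbb{E}[\mathcal{T}] &= \int_{t=0}^{\infty} \Pr\{\mathcal{T} > t\}\,dt \\
&\le T + \int_{t=T}^{\infty} \Pr\{\mathcal{T} > t\}\,dt \\
&\le T + \int_{t=T}^{\infty} 2d\exp\left(-\frac{t\delta^2}{2d\lambda_1}\right)dt < \infty.
\end{aligned}
\]

This completes the proof of the lemma.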


□ 
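The final integral in the proof of Lemma 23 has a closed form, $\int_T^\infty 2d\exp(-t\delta^2/(2d\lambda_1))\,dt = \frac{4d^2\lambda_1}{\delta^2}\exp(-T\delta^2/(2d\lambda_1))$, which gives an explicit finite bound on the expected stopping time. The sketch below evaluates this bound and cross-checks it with a midpoint-rule approximation; the parameter values and function names are hypothetical and chosen only for illustration.

```python
import math

def tail_integral(T, d, lam1, delta):
    """Closed form of the integral of 2 d exp(-t delta^2 / (2 d lam1)) over t >= T."""
    rate = delta ** 2 / (2 * d * lam1)
    return 2 * d / rate * math.exp(-rate * T)

def expected_time_bound(T, d, lam1, delta):
    """Upper bound T + tail integral on the expected stopping time."""
    return T + tail_integral(T, d, lam1, delta)

def tail_integral_numeric(T, d, lam1, delta, upper=200.0, steps=200000):
    """Midpoint-rule approximation on [T, upper], used only as a sanity check."""
    rate = delta ** 2 / (2 * d * lam1)
    h = (upper - T) / steps
    return h * sum(2 * d * math.exp(-rate * (T + (i + 0.5) * h)) for i in range(steps))

# Hypothetical parameters: T = 1, d = 2, lambda_1 = 1, delta = 1.
closed = tail_integral(1.0, 2, 1.0, 1.0)
numeric = tail_integral_numeric(1.0, 2, 1.0, 1.0)
print(closed, numeric, expected_time_bound(1.0, 2, 1.0, 1.0))
```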


Lemma 24 $\mathbb{E}[\mathcal{T}] = \tau$.

Proof. From Lemma 23 we have $\mathbb{E}[\mathcal{T}] < \infty$. Making use of Wald's equation, we have

\[ \mathbb{E}[W_{\mathcal{T}}] = \mathbb{E}[\mathcal{T}]\mu. \tag{90} \]

By the definition of $\mathcal{T}$, we have

\[ AW_{\mathcal{T}} + B\mathcal{T} = C \]

almost surely. Taking expectation on both sides of the above equation yields

\[ A\mathbb{E}[W_{\mathcal{T}}] + B\mathbb{E}[\mathcal{T}] = C. \tag{91} \]

Combining (90) and (91) yields

\[ (A\mu + B)\mathbb{E}[\mathcal{T}] = C. \tag{92} \]

From Lemma 20 we know that there exist $A$, $B$ and a positive number $C$ such that $As + Bt = C$ is a supporting hyperplane of $\mathscr{B}$ which passes through $(\tau, \tau\mu)$, where $\tau > 0$. Hence,

\[ A(\tau\mu) + B\tau = C > 0 \]

and it follows that $A\mu + B > 0$. Dividing both sides of (92) by $A\mu + B$ yields

\[ \mathbb{E}[\mathcal{T}] = \frac{C}{A\mu + B} = \frac{A(\tau\mu) + B\tau}{A\mu + B} = \tau. \]

□ 


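We are now in a position to prove the theorem. Since $C > 0$ and $(0, 0_d)$ is contained in the convex set $\mathscr{B}$, it follows from Lemma 20 that $As + Bt \le C$ for all $(t, s) \in \mathscr{B}$. This implies that $\boldsymbol{T} \le \mathcal{T}$ and thus $\mathbb{E}[\boldsymbol{T}] \le \mathbb{E}[\mathcal{T}]$. Finally, using Lemma 24, we have $\mathbb{E}[\boldsymbol{T}] \le \mathbb{E}[\mathcal{T}] \le \tau$. This completes the proof of the theorem.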


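U Proof of Theorem 25

We shall first show $\mathbb{E}[\boldsymbol{T}] \ge \inf \mathscr{A}$ under the assumption that $\mathscr{A}$ is not empty. If $\mathbb{E}[\boldsymbol{T}] = \infty$, then $\mathbb{E}[\boldsymbol{T}] \ge \inf \mathscr{A}$ trivially holds. If $\mathbb{E}[\boldsymbol{T}] < \infty$, then $\Pr\{\boldsymbol{T} < \infty\} = 1$ and it follows that $W_{\boldsymbol{T}}$ is well-defined and

\[ \Pr\{(\boldsymbol{T}, W_{\boldsymbol{T}}) \in \mathscr{B}\} = 1. \]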

According to Theorem Q} we have 

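\[ (\mathbb{E}[\boldsymbol{T}], \mathbb{E}[W_{\boldsymbol{T}}]) \in \mathscr{B}. \]

Since $\mathbb{E}[\boldsymbol{T}] < \infty$, it follows from Wald's equation that $\mathbb{E}[W_{\boldsymbol{T}}] = \mathbb{E}[\boldsymbol{T}]\mu$. Hence,

\[ (\mathbb{E}[\boldsymbol{T}], \mathbb{E}[\boldsymbol{T}]\mu) \in \mathscr{B}, \]

which immediately implies $\sup \mathscr{A} \ge \mathbb{E}[\boldsymbol{T}] \ge \inf \mathscr{A}$. This establishes assertions (I) and (II).

It remains to show that $\mathbb{E}[\boldsymbol{T}] = \infty$ under the assumption that $\mathscr{A}$ is empty. We argue by contradiction. Suppose that $\mathbb{E}[\boldsymbol{T}] < \infty$; then $\Pr\{\boldsymbol{T} < \infty\} = 1$ and it follows that

\[ \Pr\{(\boldsymbol{T}, W_{\boldsymbol{T}}) \in \mathscr{B}\} = 1. \]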


According to Theorem Q} we have 

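\[ (\mathbb{E}[\boldsymbol{T}], \mathbb{E}[W_{\boldsymbol{T}}]) \in \mathscr{B}. \]

Since $\mathbb{E}[\boldsymbol{T}] < \infty$, it follows from Wald's equation that $\mathbb{E}[W_{\boldsymbol{T}}] = \mathbb{E}[\boldsymbol{T}]\mu$. Hence,

\[ (\mathbb{E}[\boldsymbol{T}], \mathbb{E}[\boldsymbol{T}]\mu) \in \mathscr{B}, \]

which immediately implies that $\mathscr{A}$ is not empty. This is a contradiction. Therefore, it must be true that $\mathbb{E}[\boldsymbol{T}] = \infty$ if $\mathscr{A}$ is empty. The proof of the theorem is thus completed.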
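V Proof of Theorem 26


We need some preliminary results.

Lemma 25 $\mathbb{E}[\boldsymbol{T}] < \infty$.

Proof. Note that there exists a number $\delta > 0$ such that $\{\theta \in \mathbb{R}^d : \|\theta - \mu\|_2 \le \delta\} \subseteq \mathbb{R}^d$. Since $g$ is a concave function on $\mathbb{R}^d$, it must be a continuous function on $\mathbb{R}^d$. By the boundedness theorem, there exists a positive number $T$ such that $g(\theta) < T$ for any $\theta$ contained in the set $\{\theta \in \mathbb{R}^d : \|\theta - \mu\|_2 \le \delta\}$. This implies that

\[ \{\boldsymbol{T} > t\} \subseteq \{\|\overline{W}_t - \mu\|_2 > \delta\} \]

for any $t$ greater than $T$. Hence, by the same argument as that of Lemma 23, we can show that $\mathbb{E}[\boldsymbol{T}] < \infty$.

□


We are now in a position to prove the theorem. From Lemma 25 we have $\mathbb{E}[\boldsymbol{T}] < \infty$. Hence, it follows from Wald's equation that $\mathbb{E}[W_{\boldsymbol{T}}] = \mathbb{E}[\boldsymbol{T}]\mu$. By the definition of the stopping time $\boldsymbol{T}$, we have

\[ \boldsymbol{T} = g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right) \]

almost surely. Multiplying both sides of the above equation by $\boldsymbol{T}$ yields

\[ \boldsymbol{T}^2 = \boldsymbol{T} g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right) \]

almost surely. Taking expectation on both sides of the above equation and using Jensen's inequality yields

\[ \mathbb{E}^2[\boldsymbol{T}] \le \mathbb{E}[\boldsymbol{T}^2] = \mathbb{E}\left[\boldsymbol{T} g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right)\right]. \tag{93} \]

Since $\boldsymbol{T}$ is a positive random variable and $g$ is a concave function, it follows from Theorem 3 that

\[ \mathbb{E}\left[\boldsymbol{T} g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right)\right] \le \mathbb{E}[\boldsymbol{T}]\, g\!\left(\frac{\mathbb{E}[W_{\boldsymbol{T}}]}{\mathbb{E}[\boldsymbol{T}]}\right). \tag{94} \]

Combining (93) and (94) yields

\[ \mathbb{E}^2[\boldsymbol{T}] \le \mathbb{E}[\boldsymbol{T}]\, g\!\left(\frac{\mathbb{E}[W_{\boldsymbol{T}}]}{\mathbb{E}[\boldsymbol{T}]}\right), \]

which implies

\[ \mathbb{E}[\boldsymbol{T}] \le g\!\left(\frac{\mathbb{E}[W_{\boldsymbol{T}}]}{\mathbb{E}[\boldsymbol{T}]}\right) = g\!\left(\frac{\mathbb{E}[\boldsymbol{T}]\mu}{\mathbb{E}[\boldsymbol{T}]}\right) = g(\mu). \]

This completes the proof of the theorem.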
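W Proof of Theorem 27


If $\mathbb{E}[\boldsymbol{T}] = \infty$, then the conclusion of the theorem holds trivially. So, it suffices to show the theorem under the assumption that $\mathbb{E}[\boldsymbol{T}] < \infty$. Since $\mathbb{E}[\boldsymbol{T}]$ is finite, Wald's equation $\mathbb{E}[W_{\boldsymbol{T}}] = \mathbb{E}[\boldsymbol{T}]\mu$ holds. As a consequence of the definition of the stopping time $\boldsymbol{T}$ and the fact that $\Pr\{\boldsymbol{T} < \infty\} = 1$, we have

\[ \boldsymbol{T} = \frac{1}{g(\overline{W}_{\boldsymbol{T}})}, \qquad g(\overline{W}_{\boldsymbol{T}}) > 0 \]

almost surely, which implies that

\[ \boldsymbol{T} g(\overline{W}_{\boldsymbol{T}}) = 1 \]

almost surely. Taking expectation on both sides of the above equation yields

\[ \mathbb{E}[\boldsymbol{T} g(\overline{W}_{\boldsymbol{T}})] = 1, \]

or equivalently,

\[ \mathbb{E}\left[\boldsymbol{T} g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right)\right] = 1. \]

Since $\boldsymbol{T}$ is a positive random variable and $g$ is a concave function, it follows from Theorem 3 that

\[ 1 = \mathbb{E}\left[\boldsymbol{T} g\!\left(\frac{W_{\boldsymbol{T}}}{\boldsymbol{T}}\right)\right] \le \mathbb{E}[\boldsymbol{T}]\, g\!\left(\frac{\mathbb{E}[W_{\boldsymbol{T}}]}{\mathbb{E}[\boldsymbol{T}]}\right). \tag{95} \]

Applying Wald's equation $\mathbb{E}[W_{\boldsymbol{T}}] = \mathbb{E}[\boldsymbol{T}]\mu$ to (95) yields

\[ 1 \le \mathbb{E}[\boldsymbol{T}]\, g\!\left(\frac{\mathbb{E}[W_{\boldsymbol{T}}]}{\mathbb{E}[\boldsymbol{T}]}\right) = \mathbb{E}[\boldsymbol{T}]\, g\!\left(\frac{\mathbb{E}[\boldsymbol{T}]\mu}{\mathbb{E}[\boldsymbol{T}]}\right) = \mathbb{E}[\boldsymbol{T}]\, g(\mu). \]

Since $g(\mu) > 0$, we can conclude from the inequality $1 \le \mathbb{E}[\boldsymbol{T}]\, g(\mu)$ that $\mathbb{E}[\boldsymbol{T}] \ge \frac{1}{g(\mu)}$. This completes the proof of the theorem.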